Preface


These papers were presented during the technical sessions of NATO workshop
IST-128 / RWS-019, entitled “Cyber Attack Detection, Forensics and Attribution for
Assessment  of  Mission  Impact.”  The  first  technical  session  focused  on  the  need  to 
gain  insights  into  the  intent,  motivations,  and  capabilities  of  the  attackers,  in  order 
to  understand  the  intended  and  actual  mission  impact.  The  second  explored 
whether,  and  to  what  extent,  it  is  possible  to  understand  the  mission  impact  by 
analyzing  the  observable  cyber  signal  and  events,  through  such  means  as  are 
normally  associated  with  cyber  intrusion  detection,  forensics,  and  malware 
analysis.  The  third  discussed  the  need  for  models  of  missions  and  systems  that 
support  missions,  and  the  approaches  to  constructing  such  models.  The  fourth 
technical  session  investigated  the  means  by  which  mission  impact  could  be 
simulated  or  modeled.  Additional  details  and  a  report  generated  based  on  the 
discussions  at  the  workshop  are  documented  in  a  separate  publication  titled 
“Assessing Mission Impact of Cyberattacks: Report of the NATO IST-128
Workshop.” 

Ladies and Gentlemen, distinguished participants, and fellow presenters:

Thank  you  for  the  privilege  of  allowing  me  to  pose  and  describe  a  key  cybersecurity  challenge, 
one  that  will  be  important  to  understanding  the  cybersecurity  capabilities  and  intentions  of 
potential  adversaries  and  of  other  powers  that  NATO  must  consider  in  its  force  planning,  political, 
and  operational  deliberations.  This  challenge  encompasses  a  broad  range  of  disciplines,  including 
potentially physical and behavioral sciences. I describe this challenge as detecting reliably and
characterizing  accurately  foreign  cyber  weapons  development  and  testing. 

This  challenge  is  important  indeed  to  the  mission  impact  focus  of  this  conference  and  workshop. 
NATO  and  its  members  are  faced  with  rising  concerns  about  the  ability  of  potential  adversaries 
to pose both tactical and strategic challenges to our ability to conduct effective missions, to the
industrial capacity that sustains our military effectiveness, and to state sovereignty. The
challenges  of  attack  and  exploitation  detection,  attribution,  and  characterization  are  common  at 
all  levels;  these  challenges  must  be  met  to  assess  real  or  intended  effects  on  mission 
effectiveness,  tactically  or  strategically,  as  well  as  to  help  us  understand  what  mission  and 
strategic  responses  are  appropriate  to  respond  to  such  attacks,  and,  if  possible,  to  deter  them. 

My  concern  regarding  this  challenge  was  catalyzed  by  a  question  posed  by  a  related  challenge, 
that  of  deterring  dangerous  offensive  cyber  operations,  either  on  their  own  or  as  part  of  a  general 
attack  against  a  member  nation's  infrastructure,  conducted  at  a  level  that  could  endanger  state 
sovereignty  and  the  lives  of  citizens.  A  Defense  Science  Board  effort  regarding  cyber  deterrence 
underscores  the  need  to  understand  how  cyber  deterrence  might  work,  and  some  of  the 
challenges  that  lie  before  us  if  cyber  deterrence  is  to  become  an  effective  reality.  At  the  same 
time,  the  DSB's  concerns  give  us  the  opportunity  to  study  more  closely  how  adversary  attacks 
work,  what  adversary  intentions  might  be,  how  serious  might  be  their  effects  on  our  own  mission 
capabilities,  including  counter-force  capabilities,  and  what  adjustments  in  our  own  doctrine  and 
operational  concepts  might  be  appropriate,  at  all  levels.  My  own  thinking  about  this  problem 
was  aided  by  a  comparison  of  testing,  attribution,  and  assessment  of  cyber  weapons  with  nuclear 
weapons,  giving  us  an  opportunity  to  compare  an  emerging  domain  with  one  already  well 
established. 

Deterrence theory itself is highly developed, relating as it does to nuclear attack, and it might be
argued that aspects of nuclear deterrence, particularly those relating to detection and
transparency, might be useful in building a cyber deterrence paradigm, or architecture. In brief,
nuclear deterrence relies on several key factors, all of which require the ability to gain an accurate
understanding of a potential adversary's capabilities and intentions. These factors include:

•  A  strong  understanding  of  a  potential  adversary's  geopolitical  interests,  goals,  and 
strategy; 

•  Accurate characterization of the adversary's operational concepts, as well as its table of
organization and equipment;

•  Continuous  analysis  and  evaluation  of  an  adversary's  capabilities; 

•  The ability to detect adversary testing activities, and to characterize the results of those
tests;

•  and  the  ability  to  detect  activities  preparatory  to  an  attack,  as  well  as  to  detect, 
characterize,  and  attribute  actual  attacks. 

Alongside  these  factors  is  the  need  for  a  certain  transparency,  or  the  ability  to  understand  the 
symbolic  language  used  by  an  adversary,  to  detect  changes  in  the  adversary's  intentions,  and  to 
signal  clearly  to  an  adversary  that  those  changes  have  been  detected.  This  need  for  transparency 
emerged  most  clearly  in  1962  when  US  President  John  F.  Kennedy  and  Soviet  Premier  Nikita 
Khrushchev,  and  their  countries,  confronted  each  other  in  a  crisis  that  was  only  resolved  when  a 
communication  channel,  albeit  an  informal  one,  was  established  that  allowed  the  two  leaders  to 
understand  each  other's  intentions  and  actions,  and  to  signal  the  details  necessary  for  a 
resolution  of  the  crisis.  Even  the  movements  of  American  and  Soviet  warships  became  part  of 
this  dialogue,  allowing  the  two  leaders  to  gauge  the  other's  true  intentions  and  limits. 

Our  first  factor,  the  need  for  a  strong  understanding  of  a  potential  adversary's  geopolitical 
interests, goals, and strategy, together with the need for transparency, reflects the need to
understand  human,  political,  and  organizational  behavior.  A  good  deal  of  work  has  been  done, 
particularly  in  the  realm  of  nuclear  deterrence,  in  the  political  and  social  science  disciplines.  This 
work  helps  us  evaluate  continuously  the  behavioral  trends  of  potential  adversaries  in  the  context 
of their broader strategic profile, and it plays an important role in the indications and warning
aspect  of  nuclear  deterrence. 

The  challenge  of  nuclear  deterrence  also  led  to  substantial  work  in  the  physical  sciences,  giving 
us  the  means  to  detect  and  characterize  nuclear  weapons  developmental  and  testing  activities, 
although  it  should  be  noted  that  there  is  not  a  unanimity  of  opinion  regarding  the  accuracy  and 
timeliness  of  those  means.  Nonetheless,  there  exists  reasonable  confidence  in  NATO's  ability  to 
detect,  attribute,  and  describe  nuclear  weapons  developmental  and  testing  activities.  The 
Comprehensive Test Ban Treaty, to which many NATO countries subscribe, relies on four streams
of data to detect and characterize those tests: radionuclide, seismic, infrasonic, and
hydroacoustic detection and data.

It is true, however, and probably unfortunate, that the level of clarity we have developed to
detect nuclear weapons developmental and testing activities, to attribute these activities, and
possibly even to detect activities preparatory to an attack, has not been achieved for cyber
weapons, and it is this challenge I put before our R&D community.

Have developmental activities taken place that we have not detected and characterized? Such a
question  -  the  existence  of  something  that  we  have  been  able  to  postulate  but  not  detect  -  is 
difficult  to  answer.  However,  we  might  ask:  have  tests  of  these  capabilities  taken  place?  In  this 
case,  we  might  speculate  usefully. 

It is my view that the attack on Sony was likely a weapons test. The FBI has indicated with high
confidence its assessment that the Democratic People's Republic of Korea, or North Korea, was
responsible. Why did North Korea conduct an attack on Sony? Was it out of pique, stimulated
by what is, by all accounts, a fairly wretched movie? Or did Pyongyang seek to demonstrate to
itself and to others that it possesses the cyber capability to strike at distances beyond its kinetic
reach? In choosing Sony, the US media subsidiary of a Japanese consumer electronics company,
it engaged in a scenario unlikely to spark significant retribution, even as it demonstrated its ability
to strike at a prominent commercial enterprise equipped with information technology in
common use globally. It's useful to recall North Korea's sinking in 2010 of a South
Korean frigate with a torpedo. That incident verified to North Korea that its torpedoes work and
demonstrated to potential adversaries another North Korean capability they cannot afford to overlook.

That  North  Korea  tests  kinetic  weapons  on  live,  adversary  targets  is  something  we  have  seen,  in 
the  case  of  its  test  of  a  torpedo,  and  its  live  shelling  of  South  Korean  villages.  However,  our 
understanding  of  North  Korea's  cyber  operational  concepts,  and  even  our  ability  to  distinguish 
between what may be a "merely" malicious action and a test of a capability that might be used in a
more  disciplined  manner,  is  far  from  well  evolved. 

Perhaps  a  more  troubling  example  is  the  recent  explosion  at  a  German  steel  plant.  This  explosion, 
which  caused  significant  damage,  appears  to  have  been  caused  by  the  introduction  of  malware 
to  the  industrial  control  system  used  in  a  blast  furnace,  malware  that  caused  the  blast  furnace  to 
explode. 

Was  this  a  weapons  test?  If  so,  it  taught  its  perpetrator  quite  a  lot  about  the  vulnerability  of  a 
specific  ICS,  and  without  good  attribution,  the  perpetrator  was  able  to  conduct  this  test  without 
fear  of  significant  consequences.  Perhaps  even  more  serious,  the  lack  of  attribution  means  that 
we  are  not  likely  to  be  able  to  associate  this  act  with  the  actor's  interests,  goals,  and  strategy, 
nor  are  we  able  to  engage  in  the  kind  of  transparent  communication  necessary  to  achieve 
deterrence. 

These are troubling developments, made more troubling perhaps by the much lower barriers to
entry that exist for cyber warfare than exist for nuclear weapons. The development of nuclear
weapons takes substantial resources; even their testing and evaluation are complex and
resource-intensive activities, allowing in most cases for detection and attribution. Is the same true,
however,  for  cyber  weapons?  Perhaps  not.  In  fact,  probably  not. 

For  the  R&D  community,  much  work  needs  to  be  done. 

First, what do we know about the real effects of cyber weapons on the systems on which we rely
for  state  sovereignty?  What  do  we  know  about  the  effects  of  these  weapons  on  the 
infrastructures,  damage  to  which  could  endanger  the  lives  of  our  people? 

Second,  what  are  the  characteristics  of  a  cyber  weapons  test?  What  activities  take  place  in 
preparation  for  such  a  test?  How  are  such  tests  conducted?  How  are  such  tests  evaluated  by 
those  who  conduct  them?  To  what  extent  can  we  detect  such  tests  and  evaluate  their  results  for 
our  own  purposes? 

Third, are there derivative, or "knock-on", effects of such tests that could be detected? In other
words,  can  we  detect  disturbances  in  power  plants  and  electrical  grids,  gas  pipelines,  steel  blast 
furnaces,  and  other  systems  that  are  indicative  of  a  cyber  weapons  test  or  attack,  even  if  the  test 
or  attack  itself  cannot  be  detected  and  characterized  directly? 

Fourth,  what  are  the  intended  and  actual  effects  of  the  weapons  resulting  from  such  tests  on  our 
own  mission  effectiveness,  as  well  as  on  our  strategic  interests? 

Fifth,  do  we  understand  the  behaviors  of  potential  adversaries  clearly  enough  to  relate  cyber 
weapons  developmental  activities  and  tests  to  their  national  interests,  goals,  and  strategies?  Do 
we know what kind of attacks they might be prepared to mount, what their behavioral limits are,
and  what  we  could  hold  at  risk  that  they  would  value  enough  to  deter  them  from  launching  an 
attack? 

I  would  argue  that  if  cyber  deterrence  is  important,  then  we  are  as  a  NATO  cyber  community  in 
an epoch equivalent to the days in 1962 before the Cuban Missile Crisis, an era in which our
detection  of  adversary  nuclear  activities  and  our  ability  to  relate  those  activities  to  an  adversary's 
behavior,  interests,  goals,  and  strategy  was  sufficiently  weak  as  to  place  us  under  the  threat  of 
nuclear  combat.  It  is  my  hope,  however,  that  we  won't  need  an  analogous  crisis  to  stir  us  to 
action  to  gain  the  capabilities  necessary  to  detect,  attribute,  and  characterize  cyber  weapons 
tests,  to  use  this  understanding  to  deter  their  use  in  a  way  that  threatens  our  core  interests,  and 
to prepare ourselves for a world in which these weapons are increasingly common. In addition,
as our own mission effectiveness becomes increasingly a function of our ability to employ
complex information systems, our ability to detect, attribute, characterize, and assess adversary
cyber weapons tests and the employment of these weapons will bear directly on our operational
outcomes. 

This  is  the  challenge  I  put  before  you.  Solving  it  is  going  to  require  work  in  the  physical  sciences, 
computer  sciences,  and  behavioral  sciences.  It  will  require  our  best  efforts.  First  steps  will 
include  a  recognition  of  the  seriousness  of  this  challenge.  It  will  also  require  closer  work  with 
intelligence  agencies,  helping  them  define  more  closely  their  requirements  for  collection  and 
analysis.  It  will  also  require  us  to  conduct  R&D  that  detects  cyber  weapons  tests  directly,  as  well 
as the knock-on effects these tests might have on critical infrastructures, command and control
systems,  and  other  systems.  It  will  also  require  us  to  develop  the  means  to  characterize  human 
and  political  behavior  of  potential  adversaries  and  relate  what  we  observe  regarding  cyber 
weapons  tests  to  that  behavior. 

This  won't  be  easy,  but  it  won't  be  impossible. 

Thank  you  again,  ladies  and  gentlemen,  for  your  kind  attention. 

i The author is Senior Vice President and General Manager, Cybersecurity, ICF International and adjunct professor of
cybersecurity  at  Georgetown  University  (Science  and  Technology  and  International  Affairs  Program  of  the  School  of 
Foreign  Service).  The  author  served  previously  as  Chief  of  Signals  Intelligence  Programs  at  the  National  Security 
Agency. 

ii Notes describing differences between cyber and nuclear deterrence, prepared by the author for the Intelligence
and  National  Security  Alliance  in  support  of  the  DSB  Cyber  Deterrence  Study. 


Cyber Deterrence (versus nuclear case)

Characteristic: Use Attribution
Nuclear: Strong
Cyber: Generally weak
Challenges and Comments: For cyber, progress is possible; some progress has been made; results remain uncertain

Characteristic: Test Attribution
Nuclear: Strong
Cyber: Generally weak
Challenges and Comments: For nuclear, there remains some debate (e.g., verifiability for purposes of the Comprehensive Test Ban Treaty); testing generally detectable and attributable. For cyber, lack of consensus on characteristics of what constitutes a test. Possible examples: Sony, Germany steel plant (explosion against industrial control system)

Characteristic: State of the Art
Nuclear: Nuclear weapons and delivery systems well understood
Cyber: Evolving; difficult to characterize; wide range of attacks and exploits; wide range of delivery vehicles
Challenges and Comments: Range of delivery vehicles (including social engineering) complicates detection and attribution

Characteristic: Developmental Intelligence
Nuclear: We seek to limit intelligence gathered about our own programs. We respect, however, that deterrence relies on some transparency.
Cyber: Ambiguous US position related to preservation of our capabilities
Challenges and Comments: (none)

Characteristic: Indications and Warning
Nuclear: Well developed; capabilities continue to develop
Cyber: No consensus
Challenges and Comments: For cyber, wide range of payloads and delivery and low barriers to entry for smaller powers complicates I&W

Characteristic: Standards of conduct (and communication)
Nuclear: Post Cuban Missile Crisis evolution of communication and confidence-building
Cyber: Lack of standards and communication
Challenges and Comments: Tallinn manual provides first look at cyber warfare codes of conduct; not binding

Characteristic: Command and control
Nuclear: Well evolved in the US and other great powers; evolving elsewhere
Cyber: Poorly evolved globally; complicates detection
Challenges and Comments: (none)

IST-128 Workshop on Cyber Attack Detection, Forensics and Attribution for Assessment of Mission Impact

Mission  Impact  and  the  Role  of  Behavioral  Science 
Cleotilde  Gonzalez 

Dynamic  Decision  Making  Laboratory 
Department  of  Social  and  Decision  Sciences 
Carnegie  Mellon  University 

5000  Forbes  Ave.  Porter  Hall  208,  Pittsburgh,  PA,  15213 
phone:  412-268-6242 
Fax: 412-268-6938 
E-mail:  coty@cmu.edu 

Cyberspace  is  built  by  humans  to  access  and  share  information  via  information  systems, 
computers, and communications technology. The centrality of human behavior in the cyber
world is illustrated in the process of assessing actual and/or intended damage to our missions.
Defenders  must  be  aware  of  possible  attackers’  behavior  and  strategies,  while  they  must  also 
understand  the  technological  state  of  networks  and  ongoing  activities  in  order  to  prevent  or 
reduce the negative impact of cyber attacks. In the complexity of cyberspace, where the risks of
massive damage are high, one cannot forget that humans are the source of all these risks and
complexities. At this stage, one should know that arming humans with technology and large amounts
of information will not necessarily result in a more secure cyberspace. We must ensure the
integration of our knowledge of human behavior into the development of the wide range of
security  technology  and  information  filtering  efforts.  Here  I  will  discuss  some  challenges  (i.e., 
knowledge  gaps)  in  our  understanding  of  human  behavior  in  the  cyber  world,  which  need  to  be 
addressed  in  future  research  programs.  Then,  I  will  present  an  approach  to  the  integration  of 
computational  representations  of  human  behavior  and  security  systems  and  technology. 

Knowledge gaps and research challenges

Cybersecurity issues emerge from a collection of human activities: ill-intentioned and
technologically advanced actors in cyberspace, ubiquitous integration of, and reliance on,
information and security technology in societal and personal activities, and the need for
protection and privacy of property and assets. We need to know more about these human
activities  and  their  interaction  with  technology. 

In  contrast  to  the  physical  world,  there  are  many  distinct  challenges  to  human  behavior  in 
the  cyber  world.  First,  the  amount  of  data  available  is  unusually  large  and  highly  diverse.  This  is 
due  to  relatively  inexpensive  ways  of  collecting  data  (e.g.,  network  activity)  and  to  the  number 
and  diversity  of  possible  data  sources  (each  network  node  or  piece  of  equipment  can  serve  as  a 
sensor).  Second,  cyber  attacks  can  take  many  forms,  and  each  form  might  target  different  parts  or 
services  in  the  network.  As  such,  an  attack  might  be  represented  in  only  one  data  source  or  in 
combinations  of  several  data  sources,  but  not  in  all  the  data  sources  at  the  same  time  and  in  the 
same  manner.  Thus,  the  defender  needs  to  expend  more  effort  in  searching  for  and  diagnosing 
information  to  achieve  appropriate  defense  strategies.  Third,  the  cyber  world  involves  rapid  and 
constant  change.  In  normal  day-to-day  operation,  changes  like  the  maintenance  of  network 
equipment,  the  addition  of  sub-networks,  and  changes  in  services  or  users  may  be  legitimate 
operations;  however,  they  may  also  resemble  signs  of  an  attack.  Furthermore,  changes  in 
network behaviors can be abrupt, drastic, and caused by both internal and external factors. For
example,  a  sudden  spike  in  network  activity  on  a  retailer  network  can  be  caused  by  an 
approaching  holiday  (external),  the  retailer  having  a  sale  (internal),  or  a  cyber  attack.  Fourth, 
adequate  human  awareness  highly  depends  on  the  information  coming  from  sensors  (network 
monitoring  equipment,  logs,  etc.).  A  defender  needs  to  constantly  determine  his  trust  in  the 
sensors and whether or not to rely on the information coming from them, as it is not possible to
directly  evaluate  the  sensors’  reliability.  For  example,  an  attacker  may  first  compromise  sensors 
to  deceive  a  defender  about  the  status  of  the  network  before  and  during  the  attack.  Fifth,  cyber 
attacks are adversarial digital ways of determining who gets power, wealth, and resources. A
concept of adversarial awareness needs to be developed to enhance the theory and models of
theory of mind in cyber settings. In general, we know little about why humans behave unsafely,
how humans might protect their assets from attackers, and how network defenders may learn to
predict an attacker’s intentions and detect cyber attacks. A research program could investigate
general basic questions, such as how people may learn to protect their own goods while faced
with adversaries motivated to steal them, and how the presence of technology and interactions
with levels of uncertainty and information may influence attack and defense behaviors, through
the use of security behavioral game theory and the application of economic and psychological
models in the investigation of human behavior.

In  summary,  given  the  challenges  of  the  cyber  world  and  their  implications  for  human 
behavior,  a  parallel  research  program  to  investigate  the  development  of  computational 
representations  of  human  behavior  can  be  used  to  address  these  gaps. 

Computational  representations  of  human  behavior  and  their  integration  into  security 
technology 

In order to create adaptable technology that accounts for human behavior, cognitive
states,  limitations,  and  biases,  one  needs  to  ultimately  represent  these  processes  in  a 
computational  form.  Theories  of  human  behavior  have  been  translated  at  different  levels  into 
computational  representations.  A  long  tradition  of  this  type  of  research  dates  back  to  the 
beginnings  of  Artificial  Intelligence  and  its  followers,  cognitive  architectures.  Modeling  human 
behavior  in  cyber  security  is  challenging,  given  the  gaps  identified  above,  but  many  efforts  are 
under  development.  For  example,  pattern  recognition  under  uncertainty  represents  a  defender’s 
attempt  to  find  patterns  in  the  attacker’s  action  sequence  to  predict  the  attacker’s  next  operation 
and to provide the best response to it. However, if the attacker is aware of these attempts to
detect  sequential  dependencies,  one  possible  path  of  action  is  to  constantly  change  the  malicious 
operations  and  to  exploit  sequential  dependencies.  Cognitive  models  in  ACT-R  (Anderson  and 
Lebiere,  1998,  2003)  and  neural  networks  (West  and  Lebiere,  2001)  are  capable  of  accounting 
for  the  human  ability  to  detect  sequential  dependencies,  and  they  use  the  perceived  sequence  to 
project  the  next  action  that  an  opponent  will  most  likely  take  in  a  strategic  interaction.  Also, 
cognitive  models  derived  from  instance-based  learning  theory  (IBLT)  (Gonzalez  et  al.,  2003),  a 
theory  of  decisions  from  experience  in  dynamic  tasks,  may  be  used  to  create  cognitive  models  of 
the  intrusion  detection  process  (Dutt,  Ahn,  &  Gonzalez,  2011). 
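
The idea of detecting sequential dependencies in an attacker's actions can be illustrated with a very small sketch. The code below is not an ACT-R or instance-based learning model; it is a plain first-order frequency predictor used here as a stand-in, and the action names (`scan`, `exploit`, and so on) and function names are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(actions):
    """Count how often each action is followed by each other action."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(actions, actions[1:]):
        counts[prev][nxt] += 1
    return counts


def predict_next(counts, last_action):
    """Return the most frequent follower of last_action, or None if unseen."""
    followers = counts.get(last_action)
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# Hypothetical observed attacker action sequence (illustrative only).
observed = [
    "scan", "exploit", "exfiltrate",
    "scan", "exploit", "escalate",
    "scan", "exploit", "exfiltrate",
]
transitions = fit_transitions(observed)
prediction = predict_next(transitions, "exploit")
```

An attacker who knows the defender relies on such a predictor can vary the sequence to defeat it, which is the limitation the text describes; cognitive models add memory decay and noise that this sketch leaves out.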

Relatedly,  game  theory  has  been  used  to  model  and  capture  strategies  of  defenders  and 
attackers  in  security  situations  (Pita  et  al.,  2008).  Similarly,  game  theory  has  been  used  for 
decision making in cyber security (Alpcan and Başar, 2011; Grossklags et al., 2008; Lye and
Wing,  2005;  Manshaei  et  al.,  2013;  Roy  et  al.,  2010).  However,  most  game-theoretic  approaches 
to  security  have  some  limitations  and  assume  either  static  game  models  or  games  with  perfect  or 
complete  information  (Roy  et  al.,  2010).  To  some  extent,  these  assumptions  misrepresent  the 
reality of the network security context where situations are highly dynamic and the decision
maker  must  rely  on  imperfect  and  incomplete  information.  To  overcome  this,  recent  studies 
attempt  to  account  for  the  bounded  rationality  of  human  actors,  especially  human  adversaries 
(Pita  et  al.,  2012).  However,  this  and  other  game-theoretic  approaches  still  do  not  fully  address 
cognitive  mechanisms  like  memory  and  learning  that  drive  the  human  decision  making  processes 
and  can  provide  a  first-principled  predictive  account  of  human  performance,  including  both 
capabilities  and  suboptimal  biases.  Behavioral  Game  Theory  helps  to  address  some  of  the 
limitations imposed by game-theoretic approaches and to examine how learning from experience
and adaptation to the environment influence decision making and risk taking in cyber security
(Gonzalez,  2013).  Ongoing  efforts  aim  to  scale  up  cognitive  models  to  study  interactions 
between  two  or  more  decision  makers  in  social  conflicts  like  the  Prisoner’s  Dilemma  (Gonzalez 
et  al.,  2014)  and  the  Chicken  Game  (Oltramari  et  al.,  2013).  However,  scaling  up  models  of 
human  cognition  to  cyber  worlds  with  more  than  two  agents  involved  is  still  a  challenge 
(Gonzalez,  2013).  A  key  issue  is  the  need  for  a  better  understanding  of  the  role  of  uncertainty 
and  information  availability  regarding  the  attackers.  Recent  studies  examine  how  the  availability 
of descriptive and experiential information influences interactions in social dilemmas (Martin et
al., 2013; Oltramari et al., 2013). The key findings of these studies suggest that information is
needed  for  cooperation  to  emerge,  and  that  lack  of  information  fostered  situations  where  one 
decision  maker  tended  to  exploit  the  other. 
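
To make concrete what a static game with complete information looks like, the following minimal sketch enumerates the pure-strategy Nash equilibria of a two-player Prisoner's Dilemma. The payoff values are conventional illustrative numbers, not taken from the cited studies, and the function name is hypothetical.

```python
from itertools import product

ACTIONS = ["cooperate", "defect"]

# PAYOFFS[(row_action, col_action)] = (row_payoff, col_payoff).
# Illustrative Prisoner's Dilemma values (assumption, not from the source).
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}


def pure_nash_equilibria(actions, payoffs):
    """Return action pairs where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product(actions, repeat=2):
        row_best = all(payoffs[(a, b)][0] >= payoffs[(alt, b)][0] for alt in actions)
        col_best = all(payoffs[(a, b)][1] >= payoffs[(a, alt)][1] for alt in actions)
        if row_best and col_best:
            equilibria.append((a, b))
    return equilibria


equilibria = pure_nash_equilibria(ACTIONS, PAYOFFS)
```

The analysis assumes both players know the full payoff table and play once. Those are the static, complete-information assumptions the text argues do not hold for human decision makers in network security, who must learn payoffs from experience.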

Summary 

Like  many  other  problems  in  our  society  (e.g.,  poverty,  crime,  drug  abuse,  etc.), 
cybersecurity  will  never  be  solved  once  and  for  all.  Instead,  we  should  look  for  strategies  to 
manage  the  problem  in  ways  that  reduce  the  costs,  losses,  and  damage  to  our  missions.  In  this 
position  paper,  I  propose  two  ways  in  which  this  could  be  accomplished.  The  first  one  is  by 
closing  the  knowledge  gaps  in  understanding  human  behavior  through  long-term, 
multidisciplinary  research  programs  that  address  the  many  facets  of  the  mix  between  behavior 
and  technology.  The  second  one  is  through  the  parallel  research  on  the  construction  of 
computational  representations  of  human  behavior,  which  can  result  in  the  successful  integration 
of  these  representations  and  security  technology.  Understanding  and  modeling  human  behaviors 
of  the  attacker  is  tightly  connected  with  the  assessment  of  actual  or  intended  damage  to  our 
missions. 

Abstract: Assessing and understanding the impact of scattered
and widespread events on a mission is an ongoing problem.
Current  approaches  employ  score-based  algorithms  leading  to 
spurious  results.  This  paper  provides  a  formal,  mathematical 
model  for  mission  impact  assessment.  Based  on  this  model  we 
reduce  mission  impact  assessment  of  widespread  local  events  to  a 
well-understood  mathematical  problem.  Following  a  probabilistic 
approach,  we  present  a  feasible  solution  to  this  problem  and 
evaluate the solution experimentally. We take great care to use
only data and kinds of expertise that are actually available.


I.  Introduction 

Modeling  dependencies  of  missions  on  various  involved 
resources  is  a  novel  field  of  research,  which  pursues  the  goal 
of  assessing  the  influences  of  local  impacts  on  a  higher  goal, 
i.e., a mission. Early approaches use ad hoc methods for impact
assessment involving newly established algorithms.

In this work, we take a view from different perspectives
towards mission impact assessment. We consider three views from
three experts with different expertise and bring them in line
towards one well-defined mathematical model. Based on this
mathematical  model  we  find  a  well-understood  mathematical 
problem:  In  a  complex  dependency  network  we  find  multiple 
widespread  events,  whose  local  effects  must  be  assessed  towards 
a  global  effect.  Using  a  probabilistic  approach,  we  can  benefit 
from  existing,  well-defined  and  well-understood  algorithms  to 
solve  this  problem  without  returning  spurious  results. 

We focus on the actual feasibility of data acquisition and keep
manual work to a minimum. We demonstrate and evaluate
experimentally that the runtime of our approach scales linearly
with the size of the application.

The  rest  of  this  paper  is  structured  as  follows:  In  Sec.  II  we 
develop  a  mathematical  model  for  mission  impact  modeling 
based  on  views  from  different  experts.  Based  on  this  model, 
we  discuss  mission  impact  assessment  as  a  formalized  problem, 
its  theoretical  complexity  and  give  an  experimental  evaluation 
in Sec. III. As this is an emerging field of research, we give an
overview of related work in Sec. IV. Sec. V gives a conclusion
and an outlook on future work.

II.  Dependencies  and  Impacts 

In the following, we view mission impact assessment from different
perspectives. We consider the views of three experts from different
fields of expertise and bring them in line towards one well-defined
mathematical model. Based on this mathematical model we find a
well-understood mathematical problem that assesses a mission impact
from widespread local events.

Every  expert  defines  a  different  dependency  model,  where 
every  modeled  entity  represents  a  random  variable  and  a 
dependency  between  two  entities  is  represented  by  a  local 
conditional  probability. 

Remark 1. The approach presented here was developed in a
business-focused use case. Instead of referring to missions, we
refer to business processes in a company, and we use both terms
interchangeably. Every occurrence of a “business” resource
should be adaptable to a “mission” resource. ▲


A. Mission Dependency Model (Business View)

In  the  field  of  business  intelligence,  a  complete  company 
or  organization,  i.e.  a  good  we  aim  to  protect,  is  modeled  as 
a  conglomeration  of  business  processes.  Commonly,  business 
processes  are  modeled  using  the  business  process  modeling 
notation  (BPMN)  and  a  business  process  is  modeled  as  a 
(dependent)  collection  of  tasks.  This  modeling  approach  is 
well  accepted  and  can  be  found,  e.g.  in  [1],  [2],  [3].  Fig.  1 
shows  a  sketch  of  a  BPMN  model  used  throughout  this  paper. 


Figure 1. Example BPMN 2.0 model sketch for the $BP_1$ business process
shown in the dependency model of Fig. 2.

Designing BPMN models is handled manually by an expert
from a company or by an external business consultant with
expertise in business analysis. The business analysis is
performed from a pure business perspective and stops at a
“device” level, e.g. it identifies a web-service, but does not
describe the dependencies of the web-service on a database or a
data center. This is a reasonable approach, as the latter
perspective requires a very different expertise, and covering
both would require experts with a very broad range of knowledge.
Further, identifying a “web-service” as a business-relevant object
is precise in terms of a business perspective: if the
web-service is not running, the business process might not be
accomplishable. From an “IT” perspective, the web-service
might be irrelevant, as the crucial point of failure lies in the
availability of data from a database. The latter dependencies
are covered in the upcoming subsection.

We extend [4] and model mission dependencies as shown
in Fig. 2. We model a company as being dependent on its
business processes. A business process is in turn dependent
on one or more business functions. Business devices provide
business functions. Business devices are part of the network
perspective and, from a network perspective, might be irrelevant,
but were identified to be business critical. Fig. 2 shows
a dependency graph of business-relevant objects, based on the
BPMN model presented above.

We model every dependency as a local conditional probability.
Every conditional probability describes the probability of
failure if a dependency fails. E.g. the probability of the
business function $BF_1$ (see Fig. 2 and Fig. 1), e.g. “provide
access to customer data”, failing, given that the required
business device $A$, e.g. “customer-data database”, fails, is 0.9.


Figure 2. Mission Dependency Model. Values along edges denote individual
conditional probability fragments. For our example only the solid entities are
used. Consequences and further attributes are omitted in this figure.


Definition 1 (Probabilistic Preliminaries). We represent every
node inside our dependency models by a random variable, denoted
as capital $X$, where every random variable is assignable
to one of its possible values $x \in \mathrm{dom}(X)$. Let $P(X = x)$
denote the probability of random variable $X$ having $x$ as a
value. For our case we consider $\mathrm{dom}(X) = \{true, false\}$ and
we write $x$ for the event $X = true$ and $\neg x$ for $X = false$. ▲

The event $x$ represents the case that node $X$ is operationally
impacted and $\neg x$ that it is working at its full operational
capacity, i.e. no impact is present.

Definition 2 (From Dependencies to Distributions). We render
every dependency of a random variable $X$ on $Y$ as individual
conditional probabilities $p(x \mid y)$ and $p(x \mid \neg y)$. Such individual
conditional probabilities are fragments of a complete conditional
probability distribution and are therefore denoted in lowercase.
To acquire the local conditional probability distribution
$P(X \mid \vec{Y})$ of node $X$ from its individual dependencies
$p(x \mid y)$ on all nodes $Y \in \vec{Y}$ it depends on, we employ a
non-leaky noisy-or combination function [5], [6]. Non-leakiness
implies $p(x \mid \neg y) = 0$ for every dependency and therefore
$P(x \mid \neg\vec{y}) = 0$. ▲
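
As an illustration of Def. 2, the following minimal sketch combines individual fragments into a local conditional probability table with a non-leaky noisy-or. The function names and the fragment values are assumptions made for this sketch and are not taken from the paper's implementation.

```python
from itertools import product


def noisy_or(fragments, parent_states):
    """Return P(x | parent states) under a non-leaky noisy-or.

    fragments:     dict parent -> p(x | parent impacted)
    parent_states: dict parent -> True if that parent is impacted
    """
    prob_not_impacted = 1.0
    for parent, p in fragments.items():
        if parent_states[parent]:
            # Each impacted parent independently fails to impact x with 1 - p.
            prob_not_impacted *= 1.0 - p
    # Non-leaky: with no impacted parent the product stays 1, so the result is 0.
    return 1.0 - prob_not_impacted


def noisy_or_cpt(fragments):
    """Build the full table P(x | Y) for every combination of parent states."""
    parents = sorted(fragments)
    return {
        states: noisy_or(fragments, dict(zip(parents, states)))
        for states in product([True, False], repeat=len(parents))
    }


# Hypothetical fragments p(x | a) = 0.9 and p(x | b) = 0.6.
cpt = noisy_or_cpt({"A": 0.9, "B": 0.6})
for states, p in cpt.items():
    print(states, round(p, 4))
```

The table needs only one number per dependency, which is why an expert can specify fragments one edge at a time.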


With  Def.  2,  we  obtain  a  Bayesian  network  from  our  mission 
dependency  model,  for  which  we  can  specify  a  joint  probability 
distribution  over  all  entities  in  the  dependency  model  plainly 
as  the  product  of  all  local  conditional  probability  distributions. 

The example given below shows how we can use a probabilistic
dependency model as a Bayesian network and how an
impact can be assessed.

Example 1. Following the rather simple mission dependency
model depicted in Fig. 2 (excluding dashed entities), we
obtain the joint probability distribution as

$$P(CM_1, BP_1, BF_1, BF_2, A, B) = P(CM_1 \mid BP_1) \cdot P(BP_1 \mid BF_1, BF_2) \cdot P(BF_1 \mid A) \cdot P(BF_2 \mid A, B) \cdot P(A) \cdot P(B), \tag{1}$$

where $P(BP_1 \mid BF_1, BF_2)$ and $P(BF_2 \mid A, B)$ are obtained
through the noisy-or assumption from $p(bp_1 \mid bf_1)$, $p(bp_1 \mid bf_2)$
and $p(bf_2 \mid a)$, $p(bf_2 \mid b)$ respectively.

We can then marginalize the conditional probability of a
mission impact on $CM_1$ from, say, an observed impact on
$A = a$ and none on $B = \neg b$ as

$$P(cm_1 \mid a) = \alpha \cdot \sum_{BP_1} \sum_{BF_1} \sum_{BF_2} P(cm_1, BP_1, BF_1, BF_2, a, \neg b), \tag{2}$$

with a normalizing factor $\alpha$, s.t. the distribution $P(CM_1 \mid a)$ sums to 1. ♦
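
The marginalization of Example 1 can be sketched by brute-force enumeration over the hidden variables. The fragment values and priors below are hypothetical placeholders (the figure values are not reproduced here), and the helper names are assumptions of this sketch.

```python
from itertools import product

# Hypothetical fragments p(child | parent) and priors; illustration only.
FRAGMENTS = {
    "CM1": {"BP1": 0.8},
    "BP1": {"BF1": 0.7, "BF2": 0.5},
    "BF1": {"A": 0.9},
    "BF2": {"A": 0.1, "B": 0.6},
}
PRIORS = {"A": 0.2, "B": 0.3}
ORDER = ["A", "B", "BF1", "BF2", "BP1", "CM1"]  # parents before children


def local_prob(node, value, assignment):
    """P(node = value | parents) using priors or a non-leaky noisy-or."""
    if node in PRIORS:
        p_true = PRIORS[node]
    else:
        keep = 1.0
        for parent, p in FRAGMENTS[node].items():
            if assignment[parent]:
                keep *= 1.0 - p
        p_true = 1.0 - keep
    return p_true if value else 1.0 - p_true


def joint(assignment):
    """Joint probability as the product of all local distributions, as in Eq. (1)."""
    result = 1.0
    for node in ORDER:
        result *= local_prob(node, assignment[node], assignment)
    return result


def posterior(query, evidence):
    """P(query = true | evidence): sum out hidden variables, then normalize."""
    hidden = [n for n in ORDER if n != query and n not in evidence]
    unnormalized = {}
    for q in (True, False):
        total = 0.0
        for values in product([True, False], repeat=len(hidden)):
            assignment = dict(evidence, **dict(zip(hidden, values)))
            assignment[query] = q
            total += joint(assignment)
        unnormalized[q] = total
    alpha = 1.0 / (unnormalized[True] + unnormalized[False])
    return alpha * unnormalized[True]


# Observed impact on A, none on B.
p_cm1 = posterior("CM1", {"A": True, "B": False})
print(round(p_cm1, 4))
```

Enumeration is exponential in the number of hidden variables, so it is only practical for toy models such as this one.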

To detail the effect of an impact, we define a set of
consequences for every business process to which a possible
failure of the business process might lead. Again, we
model a consequence as an individual conditional probability
stating the probability that a consequence happens, given
an impact on the business process. Likewise, we can then
calculate the probability that a BP's consequence happens
($con_{BP}$), given all observed local impacts, say $a$, plainly as
$P(con_{BP} \mid a) = P(con_{BP} \mid bp) \cdot P(bp \mid a)$.

Still,  only  considering  a  business  view  does  not  cover 
transitively  (or  passively)  involved  resources.  To  cover  distant 
and  widespread  local  events,  which  are  not  directly  obvious, 
we  introduce  a  network  dependency  model  in  the  upcoming 
subsection. 

B.  Network  Dependency  Model  (IT  View) 

As mentioned above, an identified critical device might be
threatened transitively by further devices inside the network.
In a network modeled by an IT expert we cover dependencies
between individual network nodes, which can be, e.g.,
individual ICT servers, ICS devices, software components or
other operationally needed resources. We follow the same
“Bayesian” approach as before, i.e. every dependency between
two devices represents a local conditional probability of failure
if the dependency fails, as shown in Fig. 3.

However, in contrast to the mission dependency model,
assessing network dependencies might not be manageable
by hand. Complex network architectures render a manual
dependency analysis infeasible and error prone. Further, new
dynamically adjusting network architectures make it impossible
even for an expert to identify exact network dependencies.
However, it is possible to validate a presented network
dependency model for plausibility. We therefore employ heuristics
based on exchanged information amounts, e.g. traffic analyses,
to identify possible network dependencies. As long as a
network device only consumes information relevant for its
purpose, every data transfer inside the network must be motivated
by some dependency. Moreover, collecting traffic information
about a network is a reasonable and feasible effort. Further,
under the assumption of per-node equally distributed entropy
and encoding of consumed information, a dependency, i.e.
a conditional probability, must be a function of the consumed
information bits. We therefore reduce the infeasible effort of
identifying all dependencies by hand to finding a heuristic,
or rather, checking a generated dependency model.

Example  2.  In  our  use  case,  we  have  information  about 
exchanged  information  at  a  logical  ICT  device  level  covering 
virtual  machines  as  individual  devices.  More  granular  data,  e.g. 
on  software  layers,  was  not  acquirable.  Fortunately,  we  can 
assume  in  our  use  case  that  every  device  drives  one  purpose. 
Multiple  software  applications  running  on  one  device  will  most 
likely  be  dependent  on  each  other,  and  a  failure  of  one  software 
component  will  very  likely  lead  to  a  failure  of  other  software 
components.  We  can  say  that  dependencies  at  device  level  are 
coarse  enough. 

For example, a workstation $X$ consuming different query
results from multiple databases will distribute gained and
processed information from such queries to other devices.
The percentage of received traffic $T_{y_i:x}$ from every database
$Y_i$ towards the total received traffic can give us a good
guideline for the conditional dependency between them as
$p(x \mid y_i) = T_{y_i:x} / \sum_{y \in \vec{Y}} T_{y:x}$. However, if the workstation
further consumes an irrelevant 5 TB of cat pictures from a local
file server, the heuristic will fail, because the workstation also
consumed much irrelevant information. Depending on the
characteristics of a network or company, other heuristics might be
appropriate, e.g. deviation from a mean received amount of data
or a mapping onto a distribution. ♦
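
The traffic-share heuristic of Example 2 can be sketched as follows. The function name, the device names and the byte counts are hypothetical; the second call illustrates how a large irrelevant transfer distorts the result.

```python
def traffic_dependencies(received_bytes):
    """Estimate p(x | y_i) as the share of traffic X received from each source.

    received_bytes: dict source -> number of bytes X received from that source
    """
    total = sum(received_bytes.values())
    if total == 0:
        # No traffic observed: the heuristic suggests no dependency at all.
        return {source: 0.0 for source in received_bytes}
    return {source: amount / total for source, amount in received_bytes.items()}


# Hypothetical workstation consuming query results from three databases.
deps = traffic_dependencies({"db1": 600, "db2": 300, "db3": 100})
print(deps)

# The same workstation after also pulling a large irrelevant transfer:
# the shares of the databases collapse, so the heuristic fails.
skewed = traffic_dependencies({"db1": 600, "db2": 300, "db3": 100, "fileserver": 5_000_000})
print(skewed)
```

Because the shares always sum to 1, this heuristic cannot distinguish a node with one weak dependency from a node with one strong dependency, which is one reason a plausibility check by an expert remains useful.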


Figure  3.  Network  Dependency  Model.  Dependencies  between  B,  C  would 
also  be  possible.  Conditional  probability  fragments  are  marked  along  the 
edges.  Grey  nodes  represent  external  shock  events  leading  to  local  impacts. 
The  time-varying  conditional  probability  of  local  impact  given  an  instantiated 
external  shock  event  is  given  below  an  event’s  node.  Connections  to  the  mission 
dependency  model  are  sketched  in  dashed  grey. 


C.  Local  Impacts  (Security  View) 

A third view involves a security expert able to assess local
consequences of events. In the style of reliability analyses using
Bayesian approaches we model external shock events inside
a network. Every node $X$ might be affected by one or more
external shock events $\vec{SE}$, which are prior random variables.
An external shock event $SE \in \vec{SE}$ might be present ($se$) or
not be present ($\neg se$), for which a prior random distribution
$P(SE)$ is defined. In the case that an external shock event is
present ($se$), there exists a probability of it affecting a node $X$,
expressed as a local conditional probability fragment $p(x \mid se)$.
If an external shock event exists and it is not inhibited, we
speak of a local impact on $X$. In the case that the external shock
event is not present, i.e. $\neg se$, it does not affect random variable
$X$ and we write $p(x \mid \neg se) = 0$. Every individual conditional
probability fragment from an external shock event is treated
in the same noisy-or manner as a dependency towards another
node, and thus, multiple shock events can affect one node. We
consider fully observable external shock events. Extensions to
partially observable shock events are straightforward.

Classically, a local impact can also be seen as an observation
of an impacted node, i.e. $x$. However, in a Bayesian approach
this would imply that this impact originates from inside the
modeled network and would “blame” other nodes for it. By
introducing external shock events we gain the ability to model
“soft evidence” of local impacts, i.e. we are not sure whether
an external shock event might actually lead to a local impact
and affect a node’s operational capability.

Definition 3 (Temporal Aspects). We define a temporal aspect
of an external shock event. We employ the idea of abstract
timeslices in which the effect of an external shock event
changes. Every abstract time slice then represents a duplicate
of the network and mission dependencies with a different
set of local conditional probabilities of local impacts. We
denote time-varying probabilities in a sequence notation as
$\langle t_0 : p_0, \ldots, t_T : p_T \rangle$, where we have $T + 1$ abstract timeslices.
In every abstract timeslice $i$, varying local impacts take their
respective probability $p_i$ defined for its time slice $t_i$. ▲

Every local impact represents a potential threat and can
be, for example, a consequence of a present vulnerability, a
countermeasure, an attack, or originate from hardware failure.
It lies in the expertise of a security operator to assess a potential
local impact of those threats. Note that the operator needs
neither expertise in network dependencies nor an
understanding of missions to do so. The following Ex. 3 shows
how external shock events can lead to local impacts in a
security context, namely when selecting an adequate response
plan to an (ongoing) attack.

Example 3 (Response Plan Side Effects). We employ mission
impact assessment to achieve a qualitative assessment of
potential negative side effects of a proposed response plan
to an ongoing or potential attack. We see a response plan
as a collection of individual actions affecting a network. E.g.,
a shutdown of a server might easily reduce the surface of a
potential attack. Still, if a critical device is highly dependent
on that server, the shutdown might impact a mission even more
heavily than a potential attack. We consider three mitigation
action types and transform them to external shock events,
possibly leading to local impacts.

The first mitigation action, i.e. an external shock event, is
a “shutdown”. Obviously, if a node is shut down ($se$: the
external shock event is present) we can easily say that the
probability of local impact, given the shutdown of node $X$, is
1, i.e. $p(x \mid se) = 1$.

Second, employing a patch on a node $X$ might produce
collateral damage as well. During installation of the patch,
there might be a (low) probability of immediate conflict. In the
meantime, a patch might enforce a reboot of a device. This
leads to a temporary shutdown and might lead to hardware
failure. Finally, after a successful reboot, a replacement of
hardware, and/or a restore of a previous backup, the device will
fully resume its operational capability. Using temporal aspects,
we can model a patching operation in three abstract time slices
and define the local impact probabilities of this external shock
event to be $p(x \mid se) = \langle t_0 : 0.1, t_1 : 1.0, t_2 : 0.0 \rangle$.

Our third considered mitigation action is the restriction of
a connection from node $X$ to node $Y$, i.e. a new firewall rule.
From a technical perspective this operation forbids a transfer of
data that might have been crucial for the operational capability
of a node $Y$. Therefore, a firewall rule leads to an operational
impact on $Y$. We must assess this impact locally. This is a
special case requiring Pearl’s [7] do-calculus. As a connection
between two devices resembles a dependency, we must further
actually remove this dependency. Otherwise, we would infer
further impacts over a dependency that was prohibited and
already assessed locally. To do so, we simply “bend” the
forbidden dependency to an external shock event $se$, s.t. the
local conditional failure probability $p(y \mid x)$ becomes a local
impact probability $p(y \mid se)$. Another approach, decidable by a
security operator, would be to accumulate dropped connections
and add a unified local impact for them. ♦
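
The three mitigation action types of Example 3 can be sketched as small constructors that produce external shock events. The `ShockEvent` class, the function names and the dependency dictionary layout are assumptions of this sketch; only the patch probabilities are taken from the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ShockEvent:
    """An external shock event with p(target | se) per abstract time slice."""
    name: str
    target: str
    impact_by_slice: tuple


def shutdown(node):
    # A shut-down node is impacted with certainty: p(x | se) = 1.
    return ShockEvent(f"shutdown:{node}", node, (1.0,))


def patch(node):
    # Three time slices as in Example 3: possible install conflict,
    # forced reboot, fully recovered.
    return ShockEvent(f"patch:{node}", node, (0.1, 1.0, 0.0))


def firewall_rule(dependencies, source, target):
    """Block the connection source -> target.

    dependencies: dict (child, parent) -> p(child | parent)
    The forbidden dependency is removed and "bent" into a shock event on
    the target that carries the same probability.
    """
    remaining = dict(dependencies)
    p = remaining.pop((target, source))
    return remaining, ShockEvent(f"block:{source}->{target}", target, (p,))


# Hypothetical dependencies: Y depends on X, Z depends on Y.
deps = {("Y", "X"): 0.4, ("Z", "Y"): 0.7}
remaining, event = firewall_rule(deps, "X", "Y")
print(remaining, event)
```

Returning a new dependency dictionary rather than mutating the input keeps the original model available for comparing alternative response plans.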

III.  Mathematical  Mission  Impact  Assessment 

Informally  speaking,  we  have  a  mission  dependency  network 
and  a  device  dependency  network.  In  the  device  dependency 
network,  some  nodes  are  threatened  by  external  shock  events. 
As  nodes  are  dependent,  a  threatened  node  might  again  threaten 
another  node.  We  say,  a  node  is  threatened  by  an  external 
shock  event  transitively.  This  leads  to  a  “spreading”  of  external 
shock  events.  In  the  end,  there  exists  a  probability  that  even  a 
business  process  or  the  complete  modeled  company  (mission) 
is threatened transitively by various external shock events. To
recall, being threatened by an external shock event might lead
to an impact, and calculating this “might”-probability of being
impacted due to an external shock event is a well-defined
problem, which is what we call mission impact assessment.

Definition 4 (Mission Impact Assessment). The probability
of a mission node $MN$ being impacted is defined as the
conditional probability of $MN$ being impacted ($mn$) given all
observed external shock events $se \in \vec{se}$, i.e. $P(mn \mid \vec{se})$, where
the effects of local impacts due to $\vec{se}$ are mapped globally based
on mission-dependency and network-dependency graphs. ▲


Figure 4. Illustration of $P(w_i^{MN})$ viewed as sets. Overlapping parts (filled
with patterns) are commonly shared probabilities along one path (see Def. 5)
and must not be counted twice (or even multiple times) when calculating
$\bigcup_i P(w_i^{MN})$.


Given Def. 4 it is the task of mission impact assessment to
calculate the probability $P(mn \mid \vec{se})$. To calculate this, multiple
established approaches are available. From a probabilistic graph
view, we need a sound definition of an overall joint probability
distribution as demonstrated in Ex. 1. This is well defined for
the mission dependency graph, because it is a directed acyclic
graph and is a Bayesian network. However, in the network
dependency graph we cannot assume an acyclicity constraint
and a joint probability distribution is not defined for cyclic
graphs. We could therefore transform the network dependency
model into a dynamic Bayesian network and perform a filtering
operation on it. However, this introduces a high modeling
overhead. Further, we could see the network dependency graph
as a Markov random network, which, however, due to a needed
global normalization factor, destroys an intended local view
on probabilities. Due to the employed noisy-or assumption,
we can view the graph and problem as a probabilistic logic
program determining the probability of connectivity between a
mission node and external shock events. This is a probabilistic
path search.

This means that, to calculate the conditional probability
$P(mn \mid \vec{se})$, every path $w_i^{MN}$ from an external shock event
$se \in \vec{se}$ to the mission node $MN$ is a chain of probabilities
and is sufficient to induce $\{MN = true\} = mn$. Every path
exists with a probability $P(w_i^{MN})$, where $P(w_i^{MN})$ is the
product of all probabilities in this path. Let $P(w_i^{MN})$ denote the
probability viewed as a set. $P(mn \mid \vec{se})$ is then the probability
that at least one path exists, i.e.

$$P(mn \mid \vec{se}) = P\Big(\bigcup_i w_i^{MN}\Big) = \bigcup_i P(w_i^{MN}), \tag{3}$$

where not all $P(w_i^{MN})$ are disjoint (see Fig. 4) and it is
worth noting that all paths might share common “edges”. As
every edge represents a probability, plain summation would
double count these probabilities and lead to spurious results.
This is exactly the issue from which many fudge-factor based
“propagation” algorithms in ad-hoc solutions suffer.

An exact calculation of $\bigcup_i P(w_i^{MN})$ is possible by the
inclusion and exclusion principle and the Sylvester-Poincaré
equality. Still, calculation is exponentially hard due to the
subtraction of all overlapping sets and is therefore not practical.
We therefore approximate the result by the use of a Monte-Carlo
simulation.
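
To make the double-counting issue concrete, the following sketch computes the exact union probability by inclusion and exclusion and compares it with plain summation. It assumes that edges are independent, so that the probability of several paths existing jointly is the product over the union of their edges; the edge names and values are hypothetical.

```python
from itertools import combinations
from math import prod


def union_probability(paths, edge_prob):
    """Exact P(at least one path exists) by inclusion and exclusion.

    paths:     list of frozensets of edge names
    edge_prob: dict edge name -> probability (edges assumed independent)
    """
    total = 0.0
    for k in range(1, len(paths) + 1):
        sign = 1.0 if k % 2 == 1 else -1.0
        for subset in combinations(paths, k):
            # All paths in the subset exist iff every edge in their union exists.
            edges = frozenset().union(*subset)
            total += sign * prod(edge_prob[e] for e in edges)
    return total


# Two hypothetical paths sharing the edge e1.
edge_prob = {"e1": 0.5, "e2": 0.5, "e3": 0.5}
paths = [frozenset({"e1", "e2"}), frozenset({"e1", "e3"})]

exact = union_probability(paths, edge_prob)
naive = sum(prod(edge_prob[e] for e in p) for p in paths)
print(exact, naive)
```

The loop visits every non-empty subset of paths, i.e. $2^n - 1$ terms for $n$ paths, which illustrates why the exact calculation becomes impractical.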

A. Monte-Carlo Approximation

In order to approximate $P(mn \mid \vec{se})$ we employ a two-step
simulation. As discussed in this section we calculate the
conditional probability through a probabilistic path search.
We first acquire all paths leading to external shock events, for
every mission node for which we would like to perform mission
impact assessment. Often, we would like to perform this for
every node in the mission dependency model. Finding paths
for a node in the mission dependency graph is trivial, given
found paths from business devices to external shock events. We
therefore, as step one, acquire (all) paths leading to evidence for
all business devices, which is a classic graph search problem.
Under the assumption that the number of business devices and
external shock events is small compared to the number of all
nodes in the network graph, a depth-limited search is a reasonable
approach for finding paths leading to external shock events.

Definition 5 (Probabilistic Paths). For every business device
$BD_i \in \vec{BD}$ let $w^{BD_i}$ denote the set of all paths leading to
an external shock event and let $w_j^{BD_i}$ denote the $j$th path. Let
$\vec{w}$ denote the super-set of all found paths. Every path $w_j^{BD_i}$
is a set of individual conditional probability fragments $p(x \mid y)$,
representing an edge, i.e. a dependency, from $y$ to $x$. The
product of all probability fragments $p(x \mid y) \in w_j^{BD_i}$ is the
exist-probability of a path $P(w_j^{BD_i})$. Every path $w_k^{BD_i}$ for
which holds $\exists j : w_j^{BD_i} \subset w_k^{BD_i}$ is irrelevant for calculation
and $\vec{w}$ is a finite set. Informally this means that, during path
search along one path, an already visited node must not be visited
again and we cannot get stuck in infinite loops. ▲

After  acquiring  all  paths  w  leading  to  all  business  devices, 
subsequent  paths  leading  to  business  functions,  processes  and 
the  company  are  trivially  acquired  by  following  the  paths 
leading  to  all  children. 
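
Step one, the path search of Def. 5, can be sketched as a depth-limited depth-first search that never revisits a node along one path. The graph below is an assumed reading of the cyclic network used in this paper's running example (A depends on B and C, B on D, C on D, D on B and C); the function and variable names are hypothetical.

```python
def find_paths(start, depends_on, shock_events, max_depth):
    """Return all simple paths from `start` to an external shock event.

    depends_on:   dict node -> list of nodes it depends on
    shock_events: dict node -> list of shock events acting on that node
    max_depth:    maximum number of dependency edges per path
    Each path is a list of fragments (child, parent), read as p(child | parent).
    """
    paths = []

    def visit(node, fragments, visited):
        for event in shock_events.get(node, ()):
            paths.append(fragments + [(node, event)])
        if len(fragments) >= max_depth:
            return
        for parent in depends_on.get(node, ()):
            if parent not in visited:  # no node twice, hence no infinite loops
                visit(parent, fragments + [(node, parent)], visited | {parent})

    visit(start, [], {start})
    return paths


depends_on = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": ["C", "B"]}
shock_events = {"B": ["SE_B"], "C": ["SE_C"]}

for path in find_paths("A", depends_on, shock_events, max_depth=3):
    print(path)
```

With `max_depth=3` the search returns four paths for A; a smaller limit trades completeness for runtime.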

Step two is a Monte-Carlo simulation to approximate
$P(\bigvee w^{BD_i})$ for every business device $BD_i \in \vec{BD}$. We draw
a sample from $\vec{w}$ and from all dependencies in the mission
dependency model. We check for every $BD_i$ the satisfaction of
$\bigvee w^{BD_i}$ and mark the satisfaction result on $BD_i$. Subsequently
we check for satisfaction of any children, i.e. dependencies, of
every node in the mission dependency model. Every satisfaction
for a mission node $MN$ found in the mission dependency
model is marked as a hit in $hit_{MN}$. After $n_{rou}$ rounds, the
desired conditional probability of $MN$ being impacted ($mn$),
i.e. the mission impact, given all external shock events $se \in \vec{se}$,
is approximated by $P(mn \mid \vec{se}) \approx hit_{MN} / n_{rou}$.

Remark 2 (Path Check). Checking all paths during one
Monte-Carlo round is highly optimizable. $w^{BD_i}$ can be sorted
descending by $P(w_j^{BD_i})$, s.t. the most likely existing paths are
checked first and subsequent checks can be skipped once a
path is found. Further, a path $w_j^{BD_i}$ can be sorted ascending
by its individual local conditional probability fragments, s.t.
the most unlikely random variables are checked first and further
checks inside one path can be skipped. In addition, the
complete process is highly parallelizable. ▲

Remark 3 (Temporal Aspects Implementation). We introduced
that evidence, i.e. an external shock event, can have different
conditional local probabilities depending on an abstract time
slice. This means we have a varying probability at the end
of one path $w_j^{BD_i}$. Naively, we could perform a Monte-Carlo
simulation for every abstract time slice. However, this would
redundantly simulate all non-varying probabilities. We therefore
partition $w_j^{BD_i}$ into a non-varying set of conditional probabilities,
i.e. a network path leading to an impacted node, and a set of
varying conditional probabilities, i.e. a set of local impacts. ▲

The following example gives a short demonstration of
mission impact assessment based on the defined mathematical
model and the approximate Monte-Carlo method.

Example 4. Let us consider Fig. 3, where an identified mission
critical device $A$ (compare Fig. 2) is threatened (transitively) by
local impacts on nodes $B$ and $C$. Let us call the local impacts
$SE_B$ and $SE_C$. Let us exclude the dependency of $BF_2$ on $B$
and temporal aspects for brevity. Through depth-first search
we find the paths $w^A$ as

$w_0^A = \{p(a \mid b), p(b \mid se_B)\}$
$w_1^A = \{p(a \mid b), p(b \mid d), p(d \mid c), p(c \mid se_C)\}$
$w_2^A = \{p(a \mid c), p(c \mid se_C)\}$
$w_3^A = \{p(a \mid c), p(c \mid d), p(d \mid b), p(b \mid se_B)\}$

Additional paths, e.g. $\{p(a \mid b), p(b \mid c), p(c \mid b), p(b \mid se_B)\}$,
are redundant, as, here, $w_0^A$ is always (already) satisfied if
the additional path is satisfied. After finding these paths, finding
paths to higher nodes in a mission dependency model, say, to $BF_1$,
is trivial, by simply appending $p(bf_1 \mid a)$ to every path of $A$.
Subsequently, the same holds for $BP_1$ and $CM_1$.

For the simulation, at first every used random variable
is sampled. Let $\vec{RV}$ be the vector of all random variables
included in all paths, i.e. $\vec{RV} = (p(a \mid b), p(b \mid se_B), p(b \mid d),
p(d \mid c), p(c \mid se_C), p(a \mid c), p(c \mid d), p(d \mid b), p(bf_1 \mid a), p(bf_2 \mid a),
p(bp_1 \mid bf_1), p(bp_1 \mid bf_2), p(cm_1 \mid bp_1))$. Let $\vec{rv}$ denote a sample
of $\vec{RV}$, in which $+$ represents a true sample and $-$ a false sample.

Subsequently, for every identified critical device, i.e. $A$, we
check if at least one of its paths is satisfied, i.e. if $\bigvee w^A$ is
satisfied. We obtain that one path in $w^A$ is satisfied and this
satisfies $A$. The circumstance that a second path is also satisfied,
but the remaining two paths are not, is irrelevant, and further
checks can be skipped.
Subsequently, we can check the remaining mission dependency
graph for further satisfactions in this sampling round. As $A$
and $p(bf_1 \mid a)$ are satisfied, $BF_1$ is satisfied as well. The same
holds for $BF_2$. Likewise, $BP_1$ is satisfied as well as $CM_1$.
Every satisfaction is marked as a successful Monte-Carlo round
and increments the hit counter $hit_{MN}$ of the mission node $MN$.

This procedure is repeated $n_{rou}$ times, i.e. $\vec{rv}$ is sampled and
$\vec{w}$ is checked. Finally, every operational impact assessment of a
mission node $MN$, represented by the conditional probability
$P(mn \mid se_C, se_B)$, is approximated by
$P(mn \mid se_C, se_B) \approx hit_{MN} / n_{rou}$. ▲
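
The sampling rounds of Example 4 can be sketched as below. Every fragment is sampled once per round, so that edges shared between paths are not counted twice. The four paths of A follow the example; all probability values, the function name and the seed are hypothetical choices for this sketch.

```python
import random


def monte_carlo_impact(paths_by_node, fragment_prob, rounds, seed=0):
    """Approximate P(node impacted | shock events) as hits / rounds.

    paths_by_node: dict node -> list of paths, each a list of fragments
    fragment_prob: dict fragment -> probability that the fragment holds
    """
    rng = random.Random(seed)
    hits = {node: 0 for node in paths_by_node}
    for _ in range(rounds):
        # Sample every fragment exactly once per round.
        sample = {f: rng.random() < p for f, p in fragment_prob.items()}
        for node, paths in paths_by_node.items():
            # A node is satisfied if at least one of its paths is satisfied.
            if any(all(sample[f] for f in path) for path in paths):
                hits[node] += 1
    return {node: hits[node] / rounds for node in hits}


A_PATHS = [
    [("a", "b"), ("b", "se_B")],
    [("a", "b"), ("b", "d"), ("d", "c"), ("c", "se_C")],
    [("a", "c"), ("c", "se_C")],
    [("a", "c"), ("c", "d"), ("d", "b"), ("b", "se_B")],
]
# Hypothetical fragment probabilities; not taken from the paper's figures.
FRAGMENT_PROB = {
    ("a", "b"): 0.5, ("b", "se_B"): 0.8, ("b", "d"): 0.3, ("d", "c"): 0.4,
    ("c", "se_C"): 0.6, ("a", "c"): 0.2, ("c", "d"): 0.3, ("d", "b"): 0.4,
    ("bf1", "a"): 0.9,
}
paths_by_node = {
    "A": A_PATHS,
    # Paths to BF1 are obtained by appending p(bf1 | a) to every path of A.
    "BF1": [path + [("bf1", "a")] for path in A_PATHS],
}

estimate = monte_carlo_impact(paths_by_node, FRAGMENT_PROB, rounds=10_000)
print(estimate)
```

The estimate for BF1 can never exceed the estimate for A, because every BF1 path contains a complete path of A.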

B.  Complexity  and  Experimental  Evaluation 

We implemented the proposed approach as a flexible
framework allowing user-defined definitions of local impacts,
user-defined heuristics for dependency approaches and
user-defined performance characteristics as defined below. As a
central theme, we focused on the actual feasibility of our proposal
and we demonstrate that our approach scales well, i.e. linearly,
with a graph’s complexity. In the following, we give a short
summary of the expected complexity of our approach and evaluate
it experimentally.

Evaluating and demonstrating the computational complexity
of the presented approach is difficult, as it depends on
the graph structure of the network and on the processed response
plan. We therefore use random graphs containing $n_N$ nodes
and $n_E = n_N^2 \cdot 0.1$ edges while assuring that every node is at
least bidirected. By doing so we obtain a fully connected graph
with, approximately, a 10% chance of two nodes being directly
connected. The processed response plan consists of $n_{MA}$
randomly placed mitigation actions (external shock events). For
evaluation we place rather many, $n_{MA} = n_N \cdot 0.1$, mitigation
actions, i.e., 10% of all nodes are possibly impacted. We
measure the time $t_{ps}$ required for finding all $n_P$ paths up to
depth $d_{max}$, and the time $t_{sim}$ required for simulating all found
paths $n_{roll}$ times. Every measurement is repeated in 50 different
random graphs.
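A minimal sketch of how such a test scenario could be generated is given below. It assumes that "at least bidirected" means every node has at least one incoming and one outgoing edge, which is enforced here with a ring; this interpretation, the function name `random_scenario`, and the integer rounding are assumptions.

```python
import random

def random_scenario(n_nodes, seed=0):
    """Random directed graph with about 0.1 * n_nodes^2 edges and
    0.1 * n_nodes randomly placed mitigation actions."""
    rng = random.Random(seed)
    n_edges = (n_nodes * n_nodes) // 10   # n_E = 0.1 * n_N^2
    n_ma = n_nodes // 10                  # n_MA = 0.1 * n_N

    # Assumption: a ring gives every node one incoming and one outgoing edge.
    edges = {(i, (i + 1) % n_nodes) for i in range(n_nodes)}

    # Fill up with random edges (no self-loops, no duplicates).
    while len(edges) < n_edges:
        u, v = rng.randrange(n_nodes), rng.randrange(n_nodes)
        if u != v:
            edges.add((u, v))

    mitigation_nodes = set(rng.sample(range(n_nodes), n_ma))
    return edges, mitigation_nodes

edges, mitigation_nodes = random_scenario(50)
```

With 50 nodes this yields 250 directed edges, i.e. roughly a 10% chance that a given ordered pair of nodes is directly connected, and 5 mitigation actions.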

Complexity is differentiated between both steps of the impact
assessment. Given a constant maximum search depth $d_{max}$,
depth-limited search (DLS) scales linearly with the number
of edges $n_E$, as also experimentally evaluated in Fig. 5.
Further, DLS scales slightly with the number of placed local
impacts $n_{MA}$ (compare Fig. 10), as a pre-computation of
shortest distances to local impacts per node can eliminate
dead ends early. We write the path search time as a function
$t_{ps} = f(n_E, d_{max}, n_{MA})$.

Remark 4 (DLS). Obviously, DLS scales exponentially with
a specified maximum depth $d_{max}$. In general and for our
example, the maximum depth should be chosen in the range
of the average path length inside a given graph, s.t. almost
every node is considered at least once. In order to better scale
a maximum depth it is reasonable to allow a rational $d_{max}$,
where a depth $d_{dec} < 1$ resorts to the best $d_{dec}$%, i.e. most
dependent, children.
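The depth-limited search with early dead-end elimination can be sketched as follows. The sketch assumes an edge (u, v) means "u depends on v" and that paths are searched from a mission node towards nodes carrying a local impact; the function names and the toy graph are illustrative assumptions, and the fractional-depth heuristic of Remark 4 is not covered.

```python
from collections import deque

def distances_to_impacts(edges, impacts):
    """Reverse BFS: hop distance from each node to its nearest local impact."""
    reverse = {}
    for u, v in edges:
        reverse.setdefault(v, []).append(u)
    dist = {n: 0 for n in impacts}
    queue = deque(impacts)
    while queue:
        node = queue.popleft()
        for pred in reverse.get(node, []):
            if pred not in dist:
                dist[pred] = dist[node] + 1
                queue.append(pred)
    return dist

def depth_limited_paths(edges, start, impacts, d_max):
    """All simple paths of at most d_max edges from start to a local impact."""
    succ = {}
    for u, v in edges:
        succ.setdefault(u, []).append(v)
    dist = distances_to_impacts(edges, impacts)
    found = []

    def visit(node, path):
        remaining = d_max - (len(path) - 1)
        # Prune dead ends: no local impact reachable within the remaining depth.
        if dist.get(node, d_max + 1) > remaining:
            return
        if node in impacts:
            found.append(list(path))
        if remaining == 0:
            return
        for nxt in succ.get(node, []):
            if nxt not in path:  # keep paths simple (no cycles)
                path.append(nxt)
                visit(nxt, path)
                path.pop()

    visit(start, [start])
    return found

edges = [("M", "A"), ("A", "X"), ("M", "B"), ("B", "C"), ("C", "X"), ("M", "D")]
paths_d2 = depth_limited_paths(edges, "M", {"X"}, d_max=2)
paths_d3 = depth_limited_paths(edges, "M", {"X"}, d_max=3)
```

In the toy graph, node D never leads to a local impact and is discarded immediately by the distance check, which is the effect the pre-computation is meant to achieve.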

Monte-Carlo simulation, i.e. checking of paths, scales
linearly with the number of found paths $n_P$ (compare Fig. 6
and 7) and the number of Monte-Carlo rounds $n_{roll}$ (compare
Fig. 8), i.e. $t_{sim} = f(n_P, n_{roll})$. Naturally, the number of
proofs $n_P$ scales with the number of local impacts $n_{MA}$, of
edges $n_E$ and the maximum path length $d_{max}$ (compare Fig. 9),
i.e. $n_P = f(n_{MA}, n_E, d_{max})$.

In  summary,  we  conclude  that  experimental  results  match 
theoretically  expected  complexities. 

IV.  Related  Work 

Mission modeling and mission impact assessment is an
emerging field of research and, as is natural in new, rapidly
growing research areas, existing approaches employ ad-hoc
solutions using algorithms involving fudge factors. While these
deliver early results and acclaimed solutions for mission impact
assessment, a formal definition of the underlying problem is
still missing. Fudge factors employed in newly established
algorithms lead to untraceable and spurious results that demand
data-driven validation. Unfortunately, large, standardized
datasets for validation are still missing, both for mission
impact assessment in general and for the work presented below.
In the following, we point out valuable approaches and ideas.

Figure 5. $t_{ps} = f(n_E)$. Path search is linear with the number of
edges in the graph. Black: $d_{max} = 3$, Blue: $d_{max} = 4$. $n_N$ is
linearly increased, meaning a quadratic increase of edges.
$t_{ps} = f(n_E, d_{max}, n_{MA})$, with $n_{MA}$ contributing only very
slightly. $t_{ps}$ in ms.

Figure 6. $t_{sim} = f(n_P)$. Simulation time is linear with the found
paths, $d_{max} = 3$. $n_N$ is linearly increased, meaning a linear
increase of mitigation actions and a quadratic increase of edges,
both increasing the amount of findable paths. $t_{sim}$ in ms.

Barreto  et  al.  [1]  introduce  a  well-understood  modeling 
technique  and  use  BPMN  models  to  acquire  knowledge.  An 
impact  assessment  is  based  on  various  indexes  and  numerical 
scores,  such  as  exploit  index,  impact  factor,  infrastructure 
capacity  index,  and  graph  distances.  Albanese  et  al.  present  in 
[2]  a  well-modeled  formalism  for  complex  inter-dependencies 
of  missions  as  a  set  of  tasks.  Using  numerical  scores  and 
tolerances  in  a  holistic  approach  Albanese  et  al.  focus  on 
cost  minimization.  Buckshaw  et  al.  [8]  propose  a  quantitative 
risk  management  by  involving  various  experts  and  present 
a  score-based  assessment  based  on  individual  values  and  a 
standardization  using  a  weighted  sum. 

Jacobson [4] presents a well-understood conceptual framework
using interdependencies based on operational capacity.
In this dependency model, impacts are propagated and reduce
the operational capacity. The framework uses self-defined
metrics for propagating impacts through Boolean gates.

Other works focus solely on modeling. For example, Goodall
et al. [9] focus on modeling and the integration of available
data using ontologies but do not address an impact assessment.
Another ontology-based approach is presented by D’Amico
et al. in [10]; it identifies multiple experts while noting that,
e.g., system administrators are not capable of understanding
an organization’s missions.

Figure 7. Same simulation as Fig. 6, but for $d_{max} = 4$. Linear time
complexity is also achieved for very large proof sets.

Figure 8. Simulation time directly correlates with the number of rolls,
$n_N = 500$, $n_{MA} = 50$, $d_{max} = 4$. Around 12000 proofs are found
each time. $n_{roll}$ is increased exponentially and every measurement is
repeated for 50 different random graphs. $t_{sim}$ in ms.

Figure 9. Simulation time directly correlates with the number of mitigation
actions. Constant $n_N = 1000$, $d_{max} = 3$, $n_{roll} = 1000$. $n_{MA}$ is
increased exponentially and every measurement is repeated in 50 different
random graphs. The more MAs, the more paths, the longer it takes.
$t_{sim}$ in ms.

Figure 10. Proof search time depends only negligibly on the number of
mitigation actions. $t_{ps}$ correlates with the number of edges in the
graph. Measurements collected during Fig. 9. $t_{ps}$ in ms.


Notwithstanding, we were inspired by several of the aforementioned
modeling ideas, such as using the BPMN standard, and
we considered different views from various experts. To the
best of our knowledge, we contribute a novel, formalized,
mathematical mission impact assessment to this emerging
research area.

V.  Conclusion 

We presented a well-defined mathematical mission impact
assessment, based on a probabilistic approach, without introducing
score-based propagation algorithms that return spurious
results.

We relied on the expertise of different experts and merged
all views without losing information or forcing an expert
into a field of knowledge outside their own. Based on an
established mathematical model, we reduced mission impact
assessment to a well-understood problem in computer
science. Experimental results demonstrate the scalability of the
approach, such that large-scale network scenarios can be handled.

Future work is dedicated to integrating the presented mission
impact assessment into a fully automated cyber-defense system.

ABSTRACT

It is a common scenario that the primary piece of evidence of an attack is the malicious software (malware) used
to  perpetrate  it.  Analyzing  and  characterizing  malware  requires  reverse  engineering,  a  manually  intensive  and 
time-consuming  process,  whose  learning  curve  is  quite  steep  and  is  hindered  when  anti-reverse  engineering 
techniques  are  used.  Due  to  the  high  technical  sophistication  required  for  building  advanced  and  stealthy 
persistent  malware,  it  is  quite  common  that  code  fragments  are  reused,  either  from  other  malware,  or  even  from 
legitimate  sources,  such  as  open  source  code  repositories.  The  analysts  should  thus  take  advantage  of  the  code 
reuse  in  the  production  of  malware  to  accelerate  the  reverse  engineering  process.  This  paper  presents  two 
assembly  code  analysis  techniques,  supported  by  prototypes,  towards  this  goal,  namely  assembly  to  source  code 
matching  and  assembly  code  clone  search.  Using  the  Citadel  and  Zeus  malware  as  a  case  study,  these  two 
techniques  help  to  reduce  the  number  of  functions  which  should  be  manually  analyzed  by  a  reverse  engineer. 
The  results  prove  that  the  approach  is  promising  and  is  applicable  to  other  malware  analysis  scenarios. 


1.0  INTRODUCTION 

Malicious software (malware) presents a direct threat to military operations. With the growing dependence on
communications and information systems to support military missions, malware has the capacity to impact the
availability of critical system assets and data, making missions vulnerable to cyber threats. One example is the
malware which in 2011 infected the cockpits of America’s Predator and Reaper drones at the Creech Air Force
Base in Nevada. This malware logged the pilots’ every keystroke as they remotely flew missions over Afghanistan
and other warzones [1]. It is only by dissecting malware to understand how it works through advanced analysis
techniques that it can be defeated or eliminated. Otherwise, the malware can resist multiple removal efforts and
re-infect the systems, as in the case of Creech’s computers [1].

Performing in-depth analysis of malware necessitates reverse engineering, which is not a simple learning
endeavour [2]. The learning process is quite involved, as it requires knowledge from several disparate domains,
such as computer architecture, systems programming, operating systems, and compilers [2]. Although software
reverse engineering came of age in 1990 [3], it is only in the last decade that its importance and visibility have
risen. However, despite a growing community, it is still perceived by many as a dark art. This paper presents
two techniques, together with the prototypes implementing them, which, by leveraging publicly available open
source code repositories and previously analyzed assembly code fragments, reduce the entry barrier new
analysts face, as well as enhance and accelerate the reverse engineering process. The remainder of this paper is
organized as follows: Section 2 presents the assembly to source code matching technique and its associated
prototype. Section 3 provides background information on code clone detection, the research area on which the
identification of reused code fragments is built, the different types of assembly code clones which should be
detected by the proposed technique, as well as the prototype supporting it. Section 4 describes the results of a
case study using the two techniques to analyze the Citadel and Zeus malware. Finally, Section 5 concludes the
paper.


2.0  ASSEMBLY  TO  SOURCE  CODE  MATCHING 

Malware  authors  reuse  code  from  all  sources  including  legitimate  ones,  such  as  open  source  code.  For  example, 
both  the  Conficker  worm  and  the  Waledac  bot  used  an  open  source  implementation  for  their  cryptographic 
functions  [4,  5].  In  the  case  of  the  Waledac  bot,  this  represented  25%  of  its  code  [5].  Also,  the  much-discussed 
FLAME  malware  contained  publicly  available  open  source  code  packages  such  as  SQLite  and  Lua  [6]. 

Another  example  of  source  code  reuse  for  malware  creation  is  the  Citadel  Trojan.  Citadel  is  based  on  the  leaked 
source  code  of  the  Zeus  crimeware  kit  [7].  Other  malware  (e.g.,  Gameover  Zeus,  Ice  IX,  LICAT,  and  Murofet) 
were  also  created  after  the  source  code  of  Zeus  was  made  public  [7]. 

2.1  Objective 

Some people have compared reverse engineering to solving a jigsaw puzzle [8]. You first start by finding the
corner pieces, then the frame, and after that, you work your way forward from there. Using this analogy, the
corner pieces for reverse engineering are strings, constants, and function names. Strings contain human readable
hints about a given functionality. Specific constants can give additional clues and can sometimes even be used to
identify certain types of algorithms. Function names of imported functions from shared libraries (e.g., DLL) can
reveal information about the performed actions. However, a lot of experience is needed to know, for example,
the significance of a constant in a given context or what the combination of imported functions might result in.

The  meaning  of  strings,  constants,  and  function  names  could  be  obtained  by  searching  them  on  publicly 
available  open  source  code  repositories.  This  was  the  idea  behind  the  RE-Google  IDA  Pro  plug-in  [8],  which  has 
proved  to  be  very  valuable  to  find  algorithms  and  code  excerpts  containing  such  information  on  Google  Code. 
The  ability  to  efficiently  recognize  the  open  source  origin  for  a  given  assembly  code  fragment  is  desirable,  both 
in  order  to  enhance  the  productivity  of  a  reverse  engineer,  as  well  as  to  reduce  the  odds  of  common  libraries 
leading  to  false  correlation  between  otherwise  unrelated  code  bases. 

RE-Google  [8],  a  proof  of  concept  plug-in  for  the  IDA  Pro  disassembler  [9],  extracts  constants,  library  names, 
and  strings  contained  in  a  disassembled  binary,  and  uses  them  to  search  for  code  on  Google  Code  [10].  The  links 
to  the  top  ten  source  code  files  found  are  inserted  as  comments  into  the  assembly  code  listing.  Reviewing  these 
files  frequently  provides  enlightening  insights  into  the  functionality  of  the  code  fragment  in  question  and  saves 
considerable  time.  However,  RE-Google  uses  the  Google  Code  Search  Data  Application  Programming  Interface 
(API),  which  is  no  longer  available,  making  this  plug-in  non-functional.  Furthermore,  Google  Code  will  be  shut 
down  in  January  2016. 

2.2  BinSourcerer  Prototype 

In order to be able to automatically match assembly with source code, the following alternatives to the Google
Search service were evaluated: Antelink [11], CodePlex [12], GrepCode [13], GitHub [14], Krugle [15], the
Open Hub Code Search [16], and searchcode [17]. The two selected options were the Open Hub Code Search
and GitHub.

The  Open  Hub  Code  Search,  by  Black  Duck  Software  [18],  with  its  21,372,664,482  lines  of  open  source  code, 
claims  to  be  the  world's  largest  and  most  comprehensive  code  search  engine  [16].  It  is  the  result  of  the  merge  in 
2012  with  Koders,  another  open  source  code  search  engine  acquired  by  Black  Duck  Software  in  2008.  Through 
the  Open  Hub  Code  Search,  Black  Duck  Software  wants  to  fill  the  gap  left  by  the  shutdown  of  Google  Code 
Search  in  2012. 

GitHub  is  a  web-based  hosting  service  for  software  development  projects  using  Git  [19]  at  its  heart.  Git  is  a  free 
and  open  source  distributed  version  control  system.  GitHub  claims  to  be  the  largest  code  host  on  the  planet  with 
over  21.2  million  repositories  [20].  One  advantage  that  GitHub  has  over  the  Open  Hub  Code  Search  is  its  robust 
API,  which  allows  the  integration  of  third-party  tools  or  applications  [20]. 

BinSourcerer  is  the  prototype  implementing  the  assembly  to  source  code  matching  technique.  It  draws  its 
inspiration  from  RE-Google,  but  instead  of  submitting  queries  to  Google  Code,  it  relies  on  the  Open  Hub  Code 
Search  and  GitHub  to  correlate  assembly  with  source  code.  BinSourcerer  takes  as  input  a  target  binary  file 
disassembled  with  IDA  Pro  and  performs  the  following  steps  for  each  function:  (i)  extraction  of  interesting 
features  (i.e.,  strings,  constants,  and  imported  function  names),  (ii)  feature-based  query  encoding,  (iii)  query 
refinement  for  on-line  code  repository  search  (i.e.,  the  Open  Hub  Code  Search  and  GitHub),  (iv) 
request/response  processing,  (v)  data  extraction  and  parsing,  and  (vi)  results  reporting. 
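Steps (i) and (ii) can be sketched on a plain-text disassembly listing as follows. This is only an illustration under stated assumptions: the regular expressions, the function names `extract_features` and `encode_query`, and the query format are hypothetical and do not reproduce BinSourcerer's actual implementation or the query syntax of any particular code search service.

```python
import re
from urllib.parse import quote_plus

# Assumed IDA-style text conventions (illustrative, not exhaustive).
STRING_RE = re.compile(r';\s*"([^"]+)"')        # string shown in a comment
CALL_RE = re.compile(r'\bcall\s+ds:(\w+)')      # call to an imported function
CONST_RE = re.compile(r'\b([0-9A-F]{4,})h\b')   # hex constant such as 67452301h

def extract_features(listing):
    """Step (i): collect strings, imported function names, and constants."""
    strings, imports, constants = [], [], []
    for line in listing.splitlines():
        strings += STRING_RE.findall(line)
        imports += CALL_RE.findall(line)
        constants += CONST_RE.findall(line)
    return {"strings": strings, "imports": imports, "constants": constants}

def encode_query(features, max_terms=5):
    """Step (ii): turn the features into a URL-encoded search string."""
    terms = (features["imports"]
             + ['"%s"' % s for s in features["strings"]]
             + features["constants"])
    unique = list(dict.fromkeys(terms))[:max_terms]  # de-duplicate, keep order
    return quote_plus(" ".join(unique))

listing = """\
push    offset szSubsystemProtocol ; "MY"
call    ds:CertOpenSystemStoreW
mov     eax, 67452301h
call    ds:CertCloseStore
"""
features = extract_features(listing)
query = encode_query(features)
```

The resulting string would then be refined and submitted to the chosen repository search in steps (iii) and (iv), which depend on the service being queried and are left out here.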

BinSourcerer has been released as open source code on GitHub¹ under the Apache License, Version 2.0. The
following are the different usage scenarios it supports, illustrated with examples.

2.3  Exact  Matching 

The  perfect  scenario  in  assembly  to  source  code  matching  is  when  the  source  code  of  an  assembly  code  function 
is  found  on  a  public  repository.  This  is  illustrated  in  the  following  example,  where  the  corresponding  source 
code  of  the  Citadel  Trojan  function  sub_42514F  (Figure  1)  has  been  found  on  GitHub  (Figure  2). 


1 https://github.com/BinSigma/BinSourcerer

    .text:0042514F ; =============== S U B R O U T I N E ===============
    .text:0042514F
    .text:0042514F sub_42514F      proc near               ; CODE XREF: j_CER_sub_42514F↑j
    .text:0042514F                                         ; injected_thread_start+C0↑p
    .text:0042514F                 push    ebx
    .text:00425150                 push    edi
    .text:00425151                 push    offset szSubsystemProtocol ; "MY"
    .text:00425156                 push    0               ; hProv
    .text:00425158                 xor     bl, bl
    .text:0042515A                 call    ds:CertOpenSystemStoreW
    .text:00425160                 mov     edi, eax
    .text:00425162                 test    edi, edi
    .text:00425164                 jz      short loc_42519A
    .text:00425166                 push    ebp
    .text:00425167                 mov     ebp, ds:CertEnumCertificatesInStore
    .text:0042516D                 push    esi
    .text:0042516E                 push    0
    .text:00425170                 jmp     short loc_425185
    .text:00425172
    .text:00425172 loc_425172:                             ; CODE XREF: sub_42514F+3D↓j
    .text:00425172                 push    esi             ; pCertContext
    .text:00425173                 call    ds:CertDuplicateCertificateContext
    .text:00425179                 test    eax, eax
    .text:0042517B                 jz      short loc_425184
    .text:0042517D                 push    eax             ; pCertContext
    .text:0042517E                 call    ds:CertDeleteCertificateFromStore
    .text:00425184
    .text:00425184 loc_425184:                             ; CODE XREF: sub_42514F+2C↑j
    .text:00425184                 push    esi             ; pPrevCertContext
    .text:00425185
    .text:00425185 loc_425185:                             ; CODE XREF: sub_42514F+21↑j
    .text:00425185                 push    edi             ; hCertStore
    .text:00425186                 call    ebp ; CertEnumCertificatesInStore
    .text:00425188                 mov     esi, eax
    .text:0042518A                 test    esi, esi
    .text:0042518C                 jnz     short loc_425172
    .text:0042518E                 push    eax             ; dwFlags
    .text:0042518F                 push    edi             ; hCertStore
    .text:00425190                 mov     bl, 1
    .text:00425192                 call    ds:CertCloseStore
    .text:00425198                 pop     esi
    .text:00425199                 pop     ebp
    .text:0042519A
    .text:0042519A loc_42519A:                             ; CODE XREF: sub_42514F+15↑j
    .text:0042519A                 pop     edi
    .text:0042519B                 mov     al, bl
    .text:0042519D                 pop     ebx
    .text:0042519E                 retn
    .text:0042519E sub_42514F      endp

Figure 1: Citadel sub_42514F function in IDA Pro.

    168  bool ClearSerts( const char* nameStore )
    169  {
    170      bool ret = false;
    171
    172      HANDLE hstore = CertOpenSystemStore( NULL, nameStore );
    173      if( hstore != NULL )
    174      {
    175          PCCERT_CONTEXT certContext = 0;
    176          while( (certContext = CertEnumCertificatesInStore( hstore, certContext )) != NULL )
    177          {
    178              PCCERT_CONTEXT dupCertContext = CertDuplicateCertificateContext(certContext);
    179              if( dupCertContext != NULL )
    180                  CertDeleteCertificateFromStore(dupCertContext);
    181          }
    182          ret = true;
    183          CertCloseStore( hstore, 0 );
    184      }
    185      return ret;
    186  }
    187
    188  void ClearDataSert( DataSert& dataSert )
    189  {
    190      LocalFree(dataSert.pfxBlob.pbData);
    191      dataSert.pfxBlob.pbData = 0;
    192      dataSert.pfxBlob.cbData = 0;
    193  }

Figure 2: Sert.cpp file excerpt from GitHub.

Figure 2 displays a subset of the file Sert.cpp found on the GitHub Carberp² repository. All the calls to the
Windows API functions (coloured in pink) in Figure 1 are also present in Figure 2, as shown in Table 1. The
presence of the letter W appended at the end of the function name CertOpenSystemStoreW in the disassembly
and not in the source code listing is due to the fact that there are two versions of CertOpenSystemStore: one
for ASCII strings (ending with an A) and one for Unicode (ending with a W). In the present case, the
CertOpenSystemStore function was compiled for Unicode.
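When comparing imported function names from a disassembly with identifiers in source code, the A/W suffix therefore has to be taken into account. A minimal sketch of one way to do this is shown below; the helper name `normalize_api_name` and the idea of checking against a set of names seen on the source side are assumptions made for illustration.

```python
def normalize_api_name(name, known_base_names):
    """Drop a trailing 'A' or 'W' only if the remaining name is a known API.

    known_base_names is a hypothetical set of identifiers collected from the
    source code side; the check avoids mangling names that merely happen to
    end in 'A' or 'W'.
    """
    if name[-1:] in ("A", "W") and name[:-1] in known_base_names:
        return name[:-1]
    return name

source_names = {"CertOpenSystemStore", "CertCloseStore"}
normalized = normalize_api_name("CertOpenSystemStoreW", source_names)
```

Names without a suffix variant, such as CertCloseStore, pass through unchanged.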


Table 1: Correspondence between assembly and source code.

    Function Name                      IDA Pro Virtual Address    Source Code Line Number
    CertOpenSystemStoreW               0042515A                   172
    CertEnumCertificatesInStore        00425167                   176
    CertDuplicateCertificateContext    00425173                   178
    CertDeleteCertificateFromStore     0042517E                   180
    CertCloseStore                     00425192                   183

Figure 1 shows that IDA Pro was also able to extract the string “my” at the address 00425150, which is passed
as a parameter to the function CertOpenSystemStoreW. This string is also present in the file Sert.cpp. It is
initialized at line 51 (Figure 3) and its pointer is passed as a parameter to the CertOpenSystemStore function
at line 172 (Figure 2).

This example illustrates how being able to match assembly with its corresponding source code greatly
accelerates the reverse engineering process. The source code is at a higher level of abstraction than the
assembly and is thus easier to understand. In this case, it serves to clearly illustrate how the different
Windows Certificate Store functions used for cryptography and present in Citadel are related.


2 https://github.com/hzeroo/Carberp/blob/6d449afaa5fd0d0935255d2fac7c7f6689e8486b/source%20-%20absource/pro/all%20source/sert/sert.cpp


IST-128-RWS-019

Approved for public release; distribution unlimited.

UNCLASSIFIED

Software Correlation for Malware Characterization


        #pragma library("Crypt32.lib");

        typedef void (WINAPI *type_GetSert)(const char*, const char*);
        typedef void (WINAPI *type_GetSertDefault)();

        struct DataSert {
            CRYPT_DATA_BLOB pfxBlob;   // certificate
            char* name;                // store name
            WCHAR password[128];
            int count;                 // number of certificates in the store
        };

        bool GetSert( DataSert& );
        bool PutSert( DataSert& );
        void ClearDataSert( DataSert& );
        void SaveSert( DataSert&, const char* );
        bool LoadSert( DataSert&, const char* );
        bool ClearSerts( const char* nameStore );

        char* my = "My";
        char* pass = "pass";

        int main()
    34  {
    35  /*
    36      HMODULE dll = LoadLibrary( "ExportSert.dll" );
    37      type_GetSert GetSert = (type_GetSert)GetProcAddress( dll, "GetSert" );
    38      type_GetSertDefault GetSertDefault = (type_GetSertDefault)GetProcAddress( dll, "GetSertDefault" );
    40      GetSertDefault();
    42      FreeLibrary(dll);
    43  */
    44  /*
    45      bool res = ClearSerts( "My" );
    46      if( res )
    47          printf( "good");
    48      else
    49          printf("bad");
    50  */
    51      char* nameStore = "My";
    52      char* password = "pass";
    53      char* nameFile = "My_sert.pfx";
    55      DataSert dataSert;
    56      dataSert.pfxBlob.pbData = 0;
    57      dataSert.pfxBlob.cbData = 0;
    58      dataSert.name = nameStore;

Figure 3: Sert.cpp file excerpt on GitHub.

2.4  Close  Matching 

With close matching, although the exact corresponding source code cannot be found on a public repository, the
fact that certain assembly code features (e.g., strings, constants) are matched can provide additional information.
This can sometimes reveal the performed actions of a function. This is best illustrated with the following
example. Figure 4 displays the assembly code listing of a malicious Secure Shell (SSH) client. IDA Pro was able
to extract the string x11-req at the address 806721A. This string was also found on the Open Hub Code
Search, in the channels.c³ file of the MirOS Project.


    .text:08067143                 mov     eax, [ebp+s]
    .text:08067146                 mov     [ebp+var_30], eax
    .text:08067149                 mov     [esp], eax      ; s
    .text:0806714C                 call    _strlen
    .text:08067151                 mov     esi, eax
    .text:08067153                 mov     eax, ds:dword_8097F88
    .text:08067158                 test    eax, eax
    .text:0806715A                 jz      loc_8067356
    .text:08067160                 mov     [esp+4], eax    ; arg
    .text:08067164                 mov     [esp], ebx      ; s1
    .text:08067167                 call    _strcmp
    .text:0806716C                 test    eax, eax
    .text:0806716E                 jnz     loc_8067345
    .text:08067174
    .text:08067174 loc_8067174:                            ; CODE XREF: sub_8067120+243↓j
    .text:08067174                 mov     dword ptr [esp+4], 3Ah ; c
    .text:0806717C                 mov     [esp], ebx      ; s
    .text:0806717F                 call    _strchr
    .text:08067184                 test    eax, eax
    .text:08067186                 jz      loc_8067280
    .text:0806718C                 mov     dword ptr [esp+4], 2Eh ; c
    .text:08067194                 mov     [esp], eax      ; s
    .text:08067197                 call    _strchr
    .text:0806719C                 test    eax, eax
    .text:0806719E                 jz      loc_8067280
    .text:080671A4                 add     eax, 1
    .text:080671A7                 mov     dword ptr [esp+14h], 0
    .text:080671AF                 shr     esi, 1
    .text:080671B1                 mov     dword ptr [esp+0Ch], 190h
    .text:080671B9                 mov     dword ptr [esp+10h], 0
    .text:080671C1                 mov     dword ptr [esp+4], 0
    .text:080671C9                 mov     dword ptr [esp+8], 0
    .text:080671D1                 mov     [esp], eax
    .text:080671D4                 call    _strtonum
    .text:080671D9                 mov     [ebp+size], esi
    .text:080671DC                 mov     [ebp+var_20], eax
    .text:080671DF                 mov     eax, ds:dword_8097F8C
    .text:080671E4                 test    eax, eax
    .text:080671E6                 jz      loc_8067299
    .text:080671EC
    .text:080671EC loc_80671EC:                            ; CODE XREF: sub_8067120+173↓j
    .text:080671EC                                         ; sub_8067120+255↓j
    .text:080671EC                 mov     edx, [ebp+size]
    .text:080671EF                 mov     eax, ds:dword_8097F94
    .text:080671F4                 mov     [esp+4], edx
    .text:080671F8                 mov     [esp], eax
    .text:080671FB                 call    sub_807E050
    .text:08067200                 mov     ebx, eax
    .text:08067202                 mov     eax, ds:dword_80980F8
    .text:08067207                 test    eax, eax
    .text:08067209                 jz      loc_8067332
    .text:0806720F                 mov     eax, dword ptr [ebp+arg]
    .text:08067212                 mov     dword ptr [esp+8], 0 ; int
    .text:0806721A                 mov     dword ptr [esp+4], offset aX11Req ; "x11-req"
    .text:08067222                 mov     [esp], eax      ; arg
    .text:08067225                 call    sub_8067060

Figure 4: Assembly code listing of a malicious executable.


3 http://code.openhub.net/file?fid=h2710e3rYxXkxgFY3ZUnlSzqaXc&cid=rWQJw-YTJwg&s=&pp=0&fp=371636&ff=1&filterChecked=true&mp=1&ml=1&me=1&md=1#L0

    if (x11_saved_display == NULL)
        x11_saved_display = xstrdup(disp);
    else if (strcmp(disp, x11_saved_display) != 0) {
        error("x11_request_forwarding_with_spoofing: different "
            "$DISPLAY already forwarded");
        return;
    }

    cp = strchr(disp, ':');
    if (cp)
        cp = strchr(cp, '.');
    if (cp)
        screen_number = (u_int)strtonum(cp + 1, 0, 400, NULL);
    else
        screen_number = 0;

    if (x11_saved_proto == NULL) {
        /* Save protocol name. */
        x11_saved_proto = xstrdup(proto);
        /*
         * Extract real authentication data and generate fake data
         * of the same length.
         */
        x11_saved_data = xmalloc(data_len);
        x11_fake_data = xmalloc(data_len);
        for (i = 0; i < data_len; i++) {
            if (sscanf(data + 2 * i, "%2x", &value) != 1)
                fatal("x11_request_forwarding: bad "
                    "authentication data: %.100s", data);
            x11_saved_data[i] = value;
        }
        arc4random_buf(x11_fake_data, data_len);
        x11_saved_data_len = data_len;
        x11_fake_data_len = data_len;
    }

    /* Convert the fake data into hex. */
    new_data = tohex(x11_fake_data, data_len);

    /* Send the request packet. */
    if (compat20) {
        channel_request_start(client_session_id, "x11-req", 0);
        packet_put_char(0);  /* XXX bool single connection */
    } else {
        packet_start(SSH_CMSG_X11_REQUEST_FORWARDING);
    }
    packet_put_cstring(proto);
    packet_put_cstring(new_data);
    packet_put_int(screen_number);
    packet_send();
    packet_write_wait();
    xfree(new_data);

Figure 5: channels.c file excerpt on the Open Hub Code Search.

The MirOS project is a secure operating system from the BSD family for 32-bit i386 and SPARC systems
[21]. Its file channels.c comes from the OpenBSD project [22]. The fact that the string x11-req was
retrieved on the Open Hub Code Search in the channels.c file allows the reverse engineer to know that the
disassembled code displayed in Figure 4 is used for initiating X11 connection forwarding. This is mentioned in
the comments at the beginning of the file (Figure 6).

    /* $OpenBSD: channels.c,v 1.296 2009/05/25 06:48:00 andreas Exp $ */
    /*
     * Author: Tatu Ylonen <ylo@cs.hut.fi>
     * Copyright (c) 1995 Tatu Ylonen <ylo@cs.hut.fi>, Espoo, Finland
     *                    All rights reserved
     * This file contains functions for generic socket connection forwarding.
     * There is also code for initiating connection forwarding for X11 connections,
     * arbitrary tcp/ip connections, and the authentication agent connection.
     *
     * As far as I am concerned, the code I have written for this software
     * can be used freely for any purpose.  Any derived versions of this
     * software must be clearly marked as such, and if the derived work is
     * incompatible with the protocol description in the RFC file, it must be
     * called by a name other than "ssh" or "Secure Shell".
     *
     * SSH2 support added by Markus Friedl.
     * Copyright (c) 1999, 2000, 2001, 2002 Markus Friedl.  All rights reserved.
     * Copyright (c) 1999 Dug Song.  All rights reserved.
     * Copyright (c) 1999 Theo de Raadt.  All rights reserved.
     *
     * Redistribution and use in source and binary forms, with or without
     * modification, are permitted provided that the following conditions
     * are met:
     * 1. Redistributions of source code must retain the above copyright
     *    notice, this list of conditions and the following disclaimer.
     * 2. Redistributions in binary form must reproduce the above copyright
     *    notice, this list of conditions and the following disclaimer in the
     *    documentation and/or other materials provided with the distribution.
     *
     * THIS SOFTWARE IS PROVIDED BY THE AUTHOR ``AS IS'' AND ANY EXPRESS OR
     * IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES
     * OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
     * IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT, INDIRECT,
     * INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT
     * NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
     * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
     * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
     * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF
     * THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Figure 6: Comments in the channels.c file on the Open Hub Code Search.


2.5  Contextual  Matching 


Contextual matching is the process of characterizing the disassembled code under study by pairing it with
some known source code. Even when the exact or closely corresponding source code cannot be found,
searching open source code repositories can still provide information about the actions performed by
a function. This is what happened for Citadel, when an approximate code match identified a video-related
capability. Although the matching process was not perfect, it was accurate enough to reveal the context of the
function. The video capture capability of Citadel was uncovered through links to source code files on the
Black Duck Open Hub Code Search such as MHRecordContol.h, stopRecord.c, trackerRecorder.h,
signalRecorder.h, and waitRecord.c. The links found for the files were added as comments in the
disassembly of Citadel, as shown in Figure 7. This observation was further supported by the fact that the API
de-obfuscation of Citadel revealed the presence of strings such as _startRecord@16. A video start
command was also found as part of the process.
.text:0040A2F2 ; =============== SUBROUTINE =======================================
.text:0040A2F2 ; Online _ waitRecord.c _ http://code.ohloh.net/File?fid=ZxwzuBpNLs2D3LtAc9FF
.text:0040A2F2 ; Online _ trackrecorder.h _ http://code.ohloh.net/File?fid=DTr6Gip5YzqFbdATS
.text:0040A2F2 ; Online _ signalrecorder.h _ http://code.ohloh.net/File?fid=27kLwV78V3FCgebj
.text:0040A2F2 ; Online _ waitRecord.c _ http://code.ohloh.net/File?fid=ZxwzuBpNLs2D3LtAc9FF
.text:0040A2F2 ; Online _ StopRecord.cc _ http://code.ohloh.net/File?fid=-sUkIUC0Sgwje5adFox
.text:0040A2F2 ; Online _ signalrecorder.h _ http://code.ohloh.net/File?fid=27kl_wY78Y3FCgebj
.text:0040A2F2 ; Online _ StopRecord.cc _ http://code.ohloh.net/File?fid=-sUkIUC0Sgwje5adfox
.text:0040A2F2 ; Online _ numberrecorder.h _ http://code.ohloh.net/File?fid=qmeu9drTapAmQ_-E
.text:0040A2F2 ; Online _ numberrecorder.h _ http://code.ohloh.net/File?fid=qmeu9drTapAmQ_-E
.text:0040A2F2 ; Online _ trackrecorder.h _ http://code.ohloh.net/File?fid=DTr6Gip5VzqFbdATS
.text:0040A2F2
.text:0040A2F2 sub_40A2F2      proc near               ; CODE XREF: video_record_1
.text:0040A2F2                                         ; MTX_sub_40A48E+C↓p
.text:0040A2F2                 cmp     byte_438B28, 0
.text:0040A2F9                 push    ebx
.text:0040A2FA                 jz      short loc_40A300
.text:0040A2FC                 mov     al, 1
.text:0040A2FE                 pop     ebx
.text:0040A2FF                 retn
.text:0040A300
.text:0040A300 loc_40A300:                             ; CODE XREF: sub_40A2F2+8↑j
.text:0040A300                 xor     ebx, ebx
.text:0040A302                 inc     ebx
.text:0040A303                 push    ebx
.text:0040A304                 call    sub_413086
.text:0040A309                 test    eax, eax
.text:0040A30B                 jz      loc_40A391
.text:0040A311                 push    offset a_startrecord@1 ; "_startRecord@16"
.text:0040A316                 push    ebx
.text:0040A317                 call    sub_4130D8
.text:0040A31C                 push    offset a_stoprecord@4 ; "_stopRecord@4"
.text:0040A321                 push    ebx
.text:0040A322                 mov     dword_438B2C, eax
.text:0040A327                 call    sub_4130D8
.text:0040A32C                 push    offset a_freerecord@4 ; "_freeRecord@4"
.text:0040A331                 push    ebx
.text:0040A332                 mov     dword_438B30, eax
.text:0040A337                 call    sub_4130D8
.text:0040A33C                 push    offset a_isrecord@4 ; "_isRecord@4"
.text:0040A341                 push    ebx
.text:0040A342                 mov     dword_438B34, eax
.text:0040A347                 call    sub_4130D8
.text:0040A34C                 push    offset a_waitrecord@8 ; "_waitRecord@8"
.text:0040A351                 push    ebx
.text:0040A352                 mov     dword_438B38, eax
.text:0040A357                 call    sub_4130D8

Figure  7:  Video  capture  capability  discovered  in  Citadel. 

Although  contextual  matching  is  far  from  being  the  ideal  scenario,  the  high-level  information  it  provides  for  a 
function  saves  the  reverse  engineer  from  manually  analyzing  it.  With  Citadel  having  approximately  800 
functions,  any  function  for  which  the  reverse  engineer  will  not  have  to  deal  with  assembly  code  analysis  is  a  time 
saver. 


3.0  ASSEMBLY  CODE  CLONE  DETECTION 

During  the  last  few  years,  the  sophistication  of  malware  has  considerably  evolved  and  has  thus  complicated  the 
reverse  engineering  process.  While  malware  used  to  consist  of  small  programs  written  mostly  in  assembly, 
which  spread  by  infecting  other  executable  files,  today’s  malware  programs  are  written  using  high-level 
languages,  come  in  many  forms  (e.g.,  botnets,  rootkits,  malicious  document  files),  and  each  new  variant 
improves  on  the  previous  ones,  by  adding  new  capabilities  and  fixing  bugs.  Also,  as  developing  stealthy  and 
persistent  malware  requires  a  high  degree  of  technical  complexity,  it  is  quite  common  for  code  fragments  to  be 
reused  between  different  malware. 

The fact that malware authors share source code among themselves [23, 24], have adopted a versioning approach,
and use evasion techniques to bypass antivirus detection has resulted in a proliferation of malware. Since
retrieving the open source origin of a malware code fragment is not always possible, reverse engineers should
leverage the code reuse in the production of malware and correlate different malware programs to identify the
similarities between them and, thereby, the code fragments they share. This would spare them from reanalyzing
code fragments of a new malware that have already been analyzed in a previous context, and let them focus
their attention on the new functionalities of the malware under study.

The  problem  of  correlating  different  code  fragments  is  closely  related  to  the  research  area  of  clone  detection. 
Clone  detection  is  a  technique  to  identify  duplicate  code  fragments  in  a  code  base.  Traditionally,  it  has  been  used 
to  decrease  the  code  size  by  consolidating  it  and  thus,  facilitate  program  comprehension  and  software 
maintenance.  This  need  stems  from  the  fact  that  reusing  code  fragments  by  copying  and  pasting  them,  with  or 
without  modifications,  is  a  common  scenario  in  software  development  and  can  be  detrimental  to  software 
maintenance  and  evolution.  For  example,  if  a  bug  is  found  in  a  code  fragment,  then  all  similar  code  fragments 
must  also  be  verified  for  the  presence  of  this  bug. 

As  clone  detection  is  an  important  problem,  it  has  been  studied  extensively  and  numerous  clone  detection 
algorithms  exist.  However,  most  existing  clone  detection  algorithms  operate  on  source  code  and  these  solutions 
are  not  directly  applicable  to  assembly  code.  One  important  application  of  clone  detection  on  binary  code  is  the 
detection  of  copyright  infringements.  For  example,  closed  source  software  should  not  contain  open  source  code 
released under the GNU General Public License (GPL). Applying clone detection to malware analysis is
challenging because malware authors use evasion techniques to produce executable code that is syntactically
different but semantically performs the same malicious functionality.

3.1  Objective 

The  objective  of  clone  detection  is  to  identify  code  fragments  of  high  similarity  from  a  large  code  base.  The 
major  challenge  is  that  the  clone  detector  usually  does  not  know  beforehand  which  code  fragments  may  be 
repeated.  Therefore,  a  naive  clone  detection  approach  might  need  to  compare  every  pair  of  code  fragments.  Such 
a  comparison  is  prohibitively  expensive  in  terms  of  computation  and  is  infeasible  to  perform  in  many  real-life 
scenarios.  But  given  a  collection  of  previously  analyzed  assembly  files  and  a  target  assembly  code  fragment, 
such  as  in  the  case  of  malware  analysis,  the  objective  is  not  to  identify  all  the  duplicate  code  fragments.  It  is  only 
to  identify  all  the  code  fragments  in  the  previously  analyzed  assembly  files  that  are  similar  to  the  target  fragment. 
This  problem  is  known  as  assembly  code  clone  search. 

A  code  fragment  is  any  sequence  of  assembly  code  instructions,  with  or  without  comments,  at  any  granularity 
level  (e.g.,  function,  basic  block).  A  code  fragment  is  a  clone  of  another  code  fragment  if  they  are  similar 
according  to  a  given  definition  of  similarity  [25].  In  clone  detection  (or  search),  code  fragments  can  be  similar 
based  on  their  program  text  (textual  similarity)  or  functionality  (functional  similarity).  In  the  literature,  code 
clones  have  been  classified  into  Type  I,  II,  III  (textual  similarity),  and  IV  (functional  similarity)  [26].  The  results 
of  clone  detection  take  the  form  of  clone  pairs.  A  pair  of  code  fragments  is  called  a  clone  pair  if  there  exists  a 
clone-relation  between  them  (i.e.,  a  clone  pair  is  a  pair  of  code  fragments  which  are  identical  or  similar  to  each 
other)  [26]. 


3.1.1  Type  I  Clones 

A  Type  I  clone  is  when  two  or  more  code  fragments  are  identical  except  for  variations  in  whitespace,  layout,  and 
comments.  In  the  example  of  Figure  8,  the  only  difference  between  the  two  code  fragments  is  the  presence  of  the 
Memory  comment  indicated  in  red  at  line  1. 
1   push  eax ; Memory             push  eax
2   call  ds:_aligned_free         call  ds:_aligned_free
3   and   dword ptr [esi], 0       and   dword ptr [esi], 0
4   pop   ecx                      pop   ecx


Figure  8:  Type  I  clone  example. 


3.1.2  Type  II  Clones 

Type II clones are structurally and syntactically identical code fragments except for variations in identifiers,
literals, types, layout, and comments. In Figure 9, the only difference between the two code fragments is that for
some instructions, they use different constants, variable names, and labels. For example, for the assembly code
instruction at line 5, the instruction on the left uses the constant 24h and the variable name var_C, while its
corresponding instruction on the right uses the constant 20h and the variable name inBuffer. Similar
differences also apply for the instructions at line 15. For the jnz instruction at line 17, the loc_10001A97 label
is used on the left, while the loc_10001493 label is used for the instruction on its right.


 1   push  edi ; Size                 push  edi ; Size
 2   call  malloc                     call  malloc
 3   mov   edx, eax                   mov   edx, eax
 4   mov   ecx, edi                   mov   ecx, edi
 5   mov   [esp+24h+var_C], edx       mov   [esp+20h+InBuffer], edx
 6   mov   edi, edx                   mov   edi, edx
 7   mov   edx, ecx                   mov   edx, ecx
 8   xor   eax, eax                   xor   eax, eax
 9   shr   ecx, 2                     shr   ecx, 2
10   rep stosd                        rep stosd
11   mov   ecx, edx                   mov   ecx, edx
12   add   esp, 4                     add   esp, 4
13   and   ecx, 3                     and   ecx, 3
14   rep stosb                        rep stosb
15   mov   eax, [esp+20h+var_C]       mov   eax, [esp+1Ch+InBuffer]
16   test  eax, eax                   test  eax, eax
17   jnz   loc_10001A97               jnz   loc_10001493
18   mov   eax, [ebx]                 mov   eax, [ebx]
19   push  eax                        push  eax

Figure  9:  Type  II  clone  example. 


3.1.3  Type  III  Clones 

A  Type  III  clone  is  a  Type  II  clone  with  further  modifications.  Statements  can  be  changed,  added,  or  removed,  in 
addition  to  variations  in  identifiers,  literals,  types,  layout  and  comments.  In  the  example  of  Figure  10,  the  order 
of  the  two  instructions  at  lines  3  and  4  was  inverted. 
1   mov   esi, [ebp+arg_0]           mov   esi, [ebp+arg_0]
2   mov   edx, [esi+214h]            mov   edx, [esi+214h]
3   mov   edi, [esi+220h]            mov   [ebp+var_4], edx
4   mov   [ebp+var_4], edx           mov   edi, [esi+220h]
5   cmp   [esi+21Ch], edi            cmp   [esi+21Ch], edi
6   jl    short loc_76641044         jl    short loc_76641044
7   lea   ebx, [edx+edi*8]           lea   ebx, [edx+edi*8]


Figure  10:  Type  III  clone  example. 


3.1.4  Type  IV  Clones 

A Type IV clone, also known as a semantic clone, occurs when two or more code fragments perform the same
computation, but using different syntactic variants. In the example of Figure 11, the two code fragments carry
out the same function, i.e., compute the length of a string. However, their implementations differ
significantly.


 1   strlen1 proc near                    strlen3 proc near
 2                                        arg_0 = dword ptr 4
 3   arg_0 = dword ptr 4
 4                                            push   edi
 5       mov   eax, [esp+arg_0]               mov    edi, [esp+4+arg_0]
 6                                            xor    ecx, ecx
 7   loc_401004:                              not    ecx
 8       cmp   byte ptr [eax], 0              xor    al, al
 9       jz    short done                     cld
10       inc   eax                            repne scasb
11       jmp   short loc_401004               not    ecx
12                                            lea    eax, [ecx-1]
13   done:                                    pop    edi
14       sub   eax, [esp+arg_0]               retn
15       retn                             strlen3 endp
16   strlen1 endp



Figure  11:  Type  IV  clone  example. 


3.2  Exact  and  Inexact  Code  Clones 

The above definitions for the different code clone types are commonly used in the literature for source code
clone detection [26]. However, they are not directly applicable to assembly code. For example, although
possible,  Type  I  clones  seldom  occur  in  assembly  code  and  are  thus  irrelevant.  For  this  reason  and  to  simplify 
matters,  the  notion  of  exact  and  inexact  code  clones  is  introduced.  As  illustrated  in  Table  2,  an  exact  clone 
corresponds  either  to  a  Type  I  or  Type  II  clone,  while  an  inexact  clone  corresponds  to  a  Type  III  or  Type  IV 
clone. 
Table 2: Source vs. assembly code clones.

Source Code        Assembly Code
Type I Clone       Exact Clone
Type II Clone      Exact Clone
Type III Clone     Inexact Clone
Type IV Clone      Inexact Clone


3.3  BinClone  Prototype 

The BinClone prototype implemented for assembly code clone search is an improved version of the code clone
detection framework proposed by Saebjornsen et al. [27]. Figure 12 provides an overview of its eight processes.
A high-level description of each of them is first provided, followed by a detailed description of the normalization
and inexact clone detection.


Figure  12:  Assembly  code  clone  search  process  overview 

1.  Disassembler:  The  input  binaries  are  disassembled  into  assembly  files  using  IDA  Pro. 

2.  Regionizer: Each function identified by IDA Pro is partitioned into an array of overlapping regions
with a size of at most w instructions, using a sliding window with a step size of s, where w and s are
user-specified parameters. Figure 13 shows an example.


mov   edi, edi
push  ebp
push  ebp, esp
mov   eax, dword ptr [ebp+8]

Figure  13:  Regionizer  with  a  window  size  of  2  and  a  stride  of  1 
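
The sliding-window partitioning performed by the regionizer can be sketched as follows. This is a minimal illustration, not the BinClone implementation: the function name regionize and the handling of the last, possibly shorter, window are assumptions made for this example. The input instructions are copied from Figure 13.

```python
from typing import List


def regionize(instructions: List[str], w: int, s: int) -> List[List[str]]:
    """Split one function into overlapping regions of at most w instructions.

    w is the window size and s the step size (both user-specified).
    Hypothetical helper written for illustration only.
    """
    if w < 1 or s < 1:
        raise ValueError("w and s must be positive integers")
    regions = []
    for start in range(0, len(instructions), s):
        regions.append(instructions[start:start + w])
        if start + w >= len(instructions):
            break  # the window has reached the end of the function
    return regions


# Instructions copied from Figure 13, window size 2 and step size 1.
function = [
    "mov edi, edi",
    "push ebp",
    "push ebp, esp",
    "mov eax, dword ptr [ebp+8]",
]
regions = regionize(function, w=2, s=1)
for region in regions:
    print(region)
```

With four instructions, a window size of 2, and a step size of 1, the sketch yields three overlapping regions, each sharing one instruction with its neighbor.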

3.  Normalizer:  The  constants,  memory  addresses,  and  registers  in  each  region  are  normalized  to 
facilitate  their  comparison  in  the  subsequent  clone  detection  process. 



4.  Exact clone detector: A clone pair is defined as an unordered pair of clone regions which have
similar normalized instructions. A clone cluster is a group of clone pairs. The exact clone detector
identifies clones among the regions by comparing their instruction mnemonics. Two regions are
considered an exact clone if and only if all the normalized instructions in the two regions are identical.
A naive approach to identify exact clones would be to compare every region pair. Yet, this approach
is too computationally expensive with a complexity of O(n^2), where n is the total number of regions.
Thus, a hashing approach is used. Specifically, two regions are considered an exact clone if they share
the same hash value. The exact clone detector is an improvement over the work of Schulman [28].

5.  Inexact clone detector: This step extracts features for each region and forms a feature vector,
denoted by v, for each region. Two regions r_x and r_y are considered an inexact clone if the similarity
between their feature vectors, denoted by sim(v_x, v_y), meets a user-specified minimum similarity
threshold minS.

6.  Duplicate clone merger: The inexact clone detector might misclassify two consecutive regions as a
clone. The duplicate clone merger removes clones that are just highly overlapping consecutive
regions. This happens when the step size s is smaller than the window size w.

7.  Maximal  clone  merger:  As  the  clone  detection  process  operates  on  regions,  the  maximum  size  of  the 
identified  clones  will  correspond  to  the  region  size.  This  prevents  the  identification  of  cloned 
fragments  spread  over  consecutive  cloned  regions.  As  it  is  more  useful  to  identify  a  large  clone  than 
several  smaller  ones,  consecutive  cloned  regions  are  merged  into  a  larger  clone. 
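
The hashing idea used by the exact clone detector (step 4) can be sketched as follows. This is only an illustration under stated assumptions: the paper does not say which hash function BinClone uses, so SHA-1 from the Python standard library is an arbitrary choice, and the function name exact_clone_pairs and the region identifiers are hypothetical. Each region is hashed once, so regions are grouped in a single pass instead of comparing every pair.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def exact_clone_pairs(regions):
    """Return the set of exact clone pairs among normalized regions.

    regions maps a region id to its list of normalized instructions.
    Two regions land in the same bucket only if their normalized
    instruction sequences hash to the same value.
    """
    buckets = defaultdict(list)
    for region_id, instructions in regions.items():
        # SHA-1 is an assumption made for this sketch, not BinClone's choice.
        text = "\n".join(instructions).encode("utf-8")
        buckets[hashlib.sha1(text).hexdigest()].append(region_id)

    pairs = set()
    for ids in buckets.values():
        # Only regions sharing a bucket are paired up.
        for a, b in combinations(sorted(ids), 2):
            pairs.add((a, b))
    return pairs


normalized_regions = {
    1: ["mov REG, REG", "push REG"],
    2: ["mov REG, REG", "push REG"],
    3: ["push REG", "mov REG, REG"],
}
print(exact_clone_pairs(normalized_regions))
```

Regions 1 and 2 contain the same normalized instructions in the same order and are therefore reported as a pair, while region 3 differs in instruction order and falls into its own bucket.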

3.3.1  Normalizer 

In assembly code, an instruction typically consists of a mnemonic (e.g., mov) and an operands list. Possible
operands can be a register (e.g., eax), a constant (e.g., 0x30004040), or a memory address (e.g.,
[0x4000349e]). As two or more code regions can be similar except for differences in the instruction
operands used, these need to be normalized in order to take into account these variations. Different works in
the literature were investigated and extensive experiments were performed on assembly code samples. These
revealed that different normalization techniques can result in significantly different clones. Therefore, to add
flexibility to the clone search process, the following normalization scheme was implemented. A constant is
normalized to VAL. Similarly, a memory address is normalized to MEM. Registers can be normalized according
to the hierarchy shown in Figure 14. This figure also illustrates how the EAX, CS, and EDI registers would be
mapped according to the different normalization levels.
Normalization level              Mapping examples
REG                              eax -> REG        cs -> REG        edi -> REG
REGSeg, REGGen, REGIdxPtr        eax -> REGGen     cs -> REGSeg     edi -> REGIdxPtr
REGGen8, REGGen16, REGGen32      eax -> REGGen32   ax -> REGGen16   ah -> REGGen8
REGx                             eax -> REG0       cs -> REG1       edi -> REG2


Figure  14:  Normalization  hierarchy  for  registers  and  mapping  examples 


Using the most abstract normalization level, Table 3 illustrates how some sample assembly code instructions
would be normalized.


Table 3: Normalized assembly code instructions

Assembly Code                      Normalized Assembly Code
mov   edi, edi                     mov   REG, REG
push  ebp                          push  REG
push  ebp, esp                     push  REG, REG
mov   eax, dword ptr [ebp+8]       mov   REG, MEM
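
The most abstract normalization level (every register mapped to REG, every constant to VAL, every memory reference to MEM) can be sketched as follows. This is a simplified illustration, not the BinClone normalizer: the register set, the constant pattern, and the decision to leave labels and symbol names untouched are assumptions made for this example.

```python
import re

# A partial list of x86 registers, sufficient for this sketch (assumption).
REGISTERS = {
    "eax", "ebx", "ecx", "edx", "esi", "edi", "ebp", "esp",
    "ax", "bx", "cx", "dx", "si", "di", "bp", "sp",
    "al", "ah", "bl", "bh", "cl", "ch", "dl", "dh",
    "cs", "ds", "es", "fs", "gs", "ss",
}

# Constants written as 0x30004040, 24h, or plain decimal digits.
CONSTANT = re.compile(r"^(0x[0-9a-f]+|[0-9][0-9a-f]*h|[0-9]+)$", re.IGNORECASE)


def normalize_operand(operand: str) -> str:
    """Map one operand to REG, VAL, or MEM at the most abstract level."""
    text = operand.strip()
    lowered = text.lower()
    if "[" in lowered:
        return "MEM"  # any bracketed expression is treated as a memory address
    if lowered in REGISTERS:
        return "REG"
    if CONSTANT.match(lowered):
        return "VAL"
    return text  # labels and symbol names are left unchanged (assumption)


def normalize_instruction(instruction: str) -> str:
    """Normalize the operands of one instruction, keeping its mnemonic."""
    mnemonic, _, rest = instruction.strip().partition(" ")
    if not rest.strip():
        return mnemonic
    operands = [normalize_operand(op) for op in rest.split(",")]
    return mnemonic + " " + ", ".join(operands)


for line in ["mov edi, edi", "push ebp", "mov eax, dword ptr [ebp+8]"]:
    print(line, "->", normalize_instruction(line))
```

Applied to the instructions of Table 3, the sketch reproduces the normalized forms listed there. The finer levels of Figure 14 would replace the single REG label with a lookup table per level.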

3.3.2  Inexact  Clone  Detector 

In  [27],  Saebjornsen  et  al.  proposed  an  inexact  clone  detector  to  identify  clone  pairs  that  are  not  exactly 
identical.  In  general,  their  approach  consists  of  first  extracting  a  set  of  features  from  each  region  and  then 
searching  for  other  code  regions  with  the  same  or  similar  feature  set.  Specifically,  a  feature  vector  is 
constructed  based  on  the  following  five  types  of  features  from  each  region  [27]: 

•  M, representing the mnemonic of the instruction

•  OPTYPE, representing the type of each operand in an instruction

•  M × OPTYPE, representing the combination of the mnemonic and the type of the first operand, when
one is present

•  OPTYPE × OPTYPE, representing the types of the first and second operands, in that order, of an
instruction with at least two operands
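
The feature types listed above can be counted per region as sketched below. This is an illustrative assumption about how the features might be tallied, not the extraction code of [27] or of BinClone: the function name extract_features, the tuple-based feature keys, and the use of occurrence counts are all hypothetical. The input is a region whose instructions have already been normalized.

```python
from collections import Counter


def extract_features(normalized_region):
    """Count the M, OPTYPE, M x OPTYPE and OPTYPE x OPTYPE features of a region."""
    features = Counter()
    for instruction in normalized_region:
        mnemonic, _, rest = instruction.strip().partition(" ")
        optypes = [op.strip() for op in rest.split(",") if op.strip()]

        features[("M", mnemonic)] += 1
        for optype in optypes:
            features[("OPTYPE", optype)] += 1
        if optypes:
            # Mnemonic combined with the type of the first operand.
            features[("MxOPTYPE", mnemonic, optypes[0])] += 1
        if len(optypes) >= 2:
            # Types of the first and second operands, in that order.
            features[("OPTYPExOPTYPE", optypes[0], optypes[1])] += 1
    return features


region = ["mov REG, REG", "push REG", "mov REG, MEM"]
features = extract_features(region)
for key, count in sorted(features.items()):
    print(key, count)
```

Fixing an ordering of the feature keys across all regions turns these counters into the feature vectors used by the inexact clone detector.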

Using  the  same  set  of  features,  a  new  approach  which  can  efficiently  identify  inexact  clone  pairs  is  proposed. 
Its  algorithm  can  be  described  in  the  following  four  steps: 
1.  Compute median vector: The median of each feature for all regions is computed. The resulting
vector is called the median vector. Intuitively, a feature having a median equal to zero implies that the
majority of regions do not contain this feature. It should thus be removed, as it cannot be used to
differentiate regions.

2.  Compute binary vectors: A binary vector is computed for each region by comparing each value of its
feature vector with the corresponding value in the median vector. If the feature value is larger than the
corresponding median, then 1 is inserted into the binary vector. Otherwise, 0 is inserted. For a region
with feature values <0, 2, 1, 4, 1>, its binary vector would be <0, 0, 0, 1, 0> with respect to the
median vector <1, 5, 2, 3, 3>.

3.  Hash  binary  vectors:  For  each  binary  vector,  a  hash  key  of  every  k  consecutive  features  is  iteratively 
computed,  where  k  is  a  user-specified  parameter.  The  regions  having  the  same  hash  key  are  put  into 
the  same  bucket  of  a  hash  table.  For  example,  Table  4  shows  that  regions  6,  7,  33,  and  76  are  hashed 
into  the  same  bucket  with  respect  to  the  first  five  consecutive  features.  The  number  of  hash  tables  is 
bounded  by  the  size  of  the  binary  vectors,  i.e.,  the  number  of  features  having  non-zero  medians. 

Table 4: Hash table for inexact clone detection

Key     Values (Region No.)
0       8, 9, 22, 156
1       6, 7, 33, 76
2       0, 56, 87, 12
31      53, 21, 1, 9

4.  Construct  clone  pairs:  Intuitively,  regions  that  frequently  appear  together  in  the  same  buckets  of 
different  hash  tables  are  similar.  They  should  therefore  form  a  clone  pair.  The  co-occurrence  of 
regions  can  be  computed  by  simply  scanning  the  hash  tables  and  keeping  track  of  the  co-occurrence 
counts  using  a  score  table.  For  example,  for  hash  key  0  in  Table  4,  the  scores  of  {8,  9},  {8,  22},  {8, 
156},  {9,  22},  {9,  156},  and  {22,  156}  are  incremented  by  1.  Similarly,  for  hash  key  31,  the  scores  of 
{53,  21},  {53,  1},  {53,  9},  {21,  1},  {21,  9},  and  {1,  9}  are  also  incremented  by  1.  The  pairs  of 
regions  having  a  score  above  the  user-specified  threshold  minS  are  considered  as  clone  pairs. 
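
The four steps above can be sketched end to end as follows. This is a minimal illustration rather than the BinClone implementation, and it rests on several assumptions: the windows of k consecutive features slide by one position, Python tuples stand in for the hash keys, a pair qualifies when its score is at least min_s, and the names binarize and inexact_clone_pairs are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from statistics import median


def binarize(values, medians):
    """Step 2: 1 where a feature value exceeds its median, 0 otherwise."""
    return [1 if v > m else 0 for v, m in zip(values, medians)]


def inexact_clone_pairs(feature_vectors, k, min_s):
    """feature_vectors maps a region id to a list of feature values.

    All vectors must have the same length and the mapping must not be empty.
    """
    region_ids = sorted(feature_vectors)
    n_features = len(feature_vectors[region_ids[0]])

    # Step 1: median of each feature; features with a zero median are dropped.
    medians = [median(feature_vectors[r][j] for r in region_ids)
               for j in range(n_features)]
    kept = [j for j in range(n_features) if medians[j] != 0]
    kept_medians = [medians[j] for j in kept]

    # Step 2: one binary vector per region.
    binary = {r: binarize([feature_vectors[r][j] for j in kept], kept_medians)
              for r in region_ids}

    # Steps 3 and 4: one hash table per window of k consecutive features;
    # regions sharing a bucket have their pair score incremented.
    scores = defaultdict(int)
    for start in range(len(kept) - k + 1):
        table = defaultdict(list)
        for r in region_ids:
            table[tuple(binary[r][start:start + k])].append(r)
        for bucket in table.values():
            for pair in combinations(bucket, 2):
                scores[pair] += 1

    # Inclusive threshold: an assumption made for this sketch.
    return {pair for pair, score in scores.items() if score >= min_s}


vectors = {
    "A": [4, 1, 4, 0],
    "B": [4, 1, 4, 0],
    "C": [1, 4, 1, 0],
    "D": [1, 4, 2, 5],
}
print(binarize([0, 2, 1, 4, 1], [1, 5, 2, 3, 3]))
print(inexact_clone_pairs(vectors, k=2, min_s=2))
```

The first call reproduces the binary vector of the example in step 2. In the second, the last feature has a zero median and is dropped, after which regions A and B, and regions C and D, share a bucket in both hash tables and are reported as clone pairs.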

3.4  Exact  and  Inexact  Clone  Pairs 

BinClone has also been released as open source code on GitHub (see footnote 4) under the Apache License,
Version 2.0. Figure 15 displays an example of an exact clone it detected in both Citadel (on the left) and
Zeus (on the right), with their differences highlighted. When compared with their corresponding instructions
in Zeus, some Citadel instructions use different registers (e.g., ebx and edi at addresses 40ADEE and 40ADEF),
labels (e.g., loc_40B135 at address 40ADFA), and function names (e.g., sub_433D74 at address 40AE59).


4  https://github.com/BinSigma/BinClone 
.text:0040ADE8  lea   eax, [ebp+var_18]       | .text:00416974  lea   eax, [ebp+var_18]
.text:0040ADEB  push  eax                     | .text:00416977  push  eax
.text:0040ADEC  push  1                       | .text:00416978  push  1
.text:0040ADEE  push  ebx                     | .text:0041697A  push  edi
.text:0040ADEF  push  edi                     | .text:0041697B  push  ebx
.text:0040ADF0  call  sub_433D74              | .text:0041697C  call  sub_40FD08
.text:0040ADF5  mov   [ebp+var_24], eax       | .text:00416981  mov   [ebp+var_24], eax
.text:0040ADF8  test  eax, eax                | .text:00416984  test  eax, eax
.text:0040ADFA  jz    loc_40B135              | .text:00416986  jz    loc_416C05
.text:0040AE00  cmp   [ebp+var_18], 0         | .text:0041698C  cmp   [ebp+var_18], 0
.text:0040AE04  jz    loc_40B135              | .text:00416990  jz    loc_416C05
.text:0040AE0A  lea   eax, [ebp+var_1C]       | .text:00416996  lea   eax, [ebp+var_1C]
.text:0040AE0D  push  eax                     | .text:00416999  push  eax
.text:0040AE0E  push  0                       | .text:0041699A  push  0
.text:0040AE10  push  ebx                     | .text:0041699C  push  edi
.text:0040AE11  push  edi                     | .text:0041699D  push  ebx
.text:0040AE12  call  sub_433D74              | .text:0041699E  call  sub_40FD08
.text:0040AE17  mov   [ebp+var_28], eax       | .text:004169A3  mov   [ebp+var_28], eax
.text:0040AE1A  test  eax, eax                | .text:004169A6  test  eax, eax
.text:0040AE1C  jz    loc_40B135              | .text:004169A8  jz    loc_416C05
.text:0040AE22  cmp   [ebp+var_1C], 0         | .text:004169AE  cmp   [ebp+var_1C], 0
.text:0040AE26  jz    loc_40B135              | .text:004169B2  jz    loc_416C05
.text:0040AE2C  lea   eax, [ebp+var_14]       | .text:004169B8  lea   eax, [ebp+var_14]
.text:0040AE2F  push  eax                     | .text:004169BB  push  eax
.text:0040AE30  push  offset aHost            | .text:004169BC  push  offset aHost
.text:0040AE35  push  ebx                     | .text:004169C1  push  edi
.text:0040AE36  push  edi                     | .text:004169C2  push  ebx
.text:0040AE37  call  sub_433D74              | .text:004169C3  call  sub_40FD08
.text:0040AE3C  mov   [ebp+var_20], eax       | .text:004169C8  mov   [ebp+var_20], eax
.text:0040AE3F  test  eax, eax                | .text:004169CB  test  eax, eax
.text:0040AE41  jz    loc_40B135              | .text:004169CD  jz    loc_416C05
.text:0040AE47  cmp   [ebp+var_14], 0         | .text:004169D3  cmp   ...
.text:0040AE4B  jz    loc_40B135              |
.text:0040AE51  lea   eax, [ebp+var_C]        |
.text:0040AE54  push  eax                     |
.text:0040AE55  push  3                       |
.text:0040AE57  push  ebx                     |
.text:0040AE58  push  edi                     |
.text:0040AE59  call  sub_433D74              |
.text:0040AE5E  mov   [ebp+var_2C], eax       |
.text:0040AE61  test  eax, eax                |
.text:0040AE63  jz    loc_40B135              |
.text:0040AE69  and   [ebp+var_10], 0         |
.text:0040AE6D  lea   eax, [ebp+var_8]        |
.text:0040AE70  push  eax                     |
.text:0040AE71  push  offset aContentLength   |
.text:0040AE76  push  ebx                     |
.text:0040AE77  push  edi                     |
.text:0040AE78  call  sub_433D74              |
.text:0040AE7D  mov   ecx, eax                |
.text:0040AE7F  test  ecx, ecx                |

Figure 15: Exact clone detected in both Citadel (left) and Zeus (right).

An  example  of  an  inexact  clone  detected  by  BinClone  between  Citadel  (on  the  left)  and  Zeus  (on  the  right)  is 
illustrated  in  Figure  16.  This  clone  is  related  to  the  RC4  function  used  for  encrypting  the  command  and  control 
(C&C)  network  traffic  between  the  bot  and  the  C&C  server. 
[Side-by-side disassembly listing of the RC4 routine in Citadel and Zeus.]

Figure  16:  Inexact  clone  detected  in  Citadel  (left)  and  Zeus  (right). 


4.0  CITADEL  AND  ZEUS  CASE  STUDY 

To test the BinSourcerer and BinClone prototypes, a case study was conducted using the Citadel and Zeus
malware. Citadel is an offspring of Zeus, which has been a prolific information-stealing Trojan since 2007. In
2011, the Zeus source code was leaked, resulting in several new malware families based on it, one of them being Citadel.
Citadel has since been used by botnet operators to steal banking credentials and personal information [29]. The
purpose of this case study was to identify the open source components used in Citadel, reveal the correlation
between the function-level features of Citadel and open source projects, and quantify the similarities
between Citadel and Zeus.

In addition to the video capture functionality described in Section 2.5 and, unsurprisingly, the Zeus source code,
BinSourcerer found, among others, references to the following open source projects on the Open Hub Code
Search: RealVNC⁵, Metasploit⁶, Anon Proxy Server⁷, as well as an open source implementation of the ZipCrypto
and CRC32 algorithms.

Table 5 displays the number of exact clones detected between Citadel and Zeus by BinClone [29]. The 526
exact Zeus clones found in Citadel represent approximately 93% of the Zeus code and 67% of the Citadel
code. As a result, if a reverse engineer has a detailed analysis of Zeus, only 33% of the Citadel code remains to be
analyzed, which saves the reverse engineer a considerable amount of time.


5 https://www.realvnc.com/

6 http://www.metasploit.com/

7 http://anonproxyserver.sourceforge.net/

Table 5: Clone detection results for Citadel and Zeus.

Malware            Number of Functions   Window Size   Step Size   Exact Clones
Citadel 1.3.5.1    788                   15            1           526
Zeus 2.1.0.1       565                   15            1           526
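The percentages quoted for Table 5 can be checked with a few lines of arithmetic. The sketch below assumes that the shares are computed over function counts, which is consistent with the table but is not stated explicitly in the text; the names are illustrative only.

```python
# Function counts and exact-clone count as reported in Table 5.
CITADEL_FUNCTIONS = 788
ZEUS_FUNCTIONS = 565
EXACT_CLONES = 526


def coverage(clones: int, total: int) -> float:
    """Return the share of `total` functions covered by `clones`, in percent."""
    return 100.0 * clones / total


# Assumption: the quoted percentages are ratios of function counts.
zeus_share = coverage(EXACT_CLONES, ZEUS_FUNCTIONS)
citadel_share = coverage(EXACT_CLONES, CITADEL_FUNCTIONS)
remaining_share = 100.0 - citadel_share

print(f"Zeus covered: {zeus_share:.1f}%")
print(f"Citadel covered: {citadel_share:.1f}%")
print(f"Citadel left to analyze: {remaining_share:.1f}%")
```

Rounded to whole percentages, these values agree with the 93%, 67%, and 33% figures given in the text.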


The above results are similar to the ones obtained by AhnLab. In [7], they present a comprehensive static
analysis of Citadel, explaining in detail its infection process, structure, main functionalities, and features. The
report mentions that Citadel physically matches Zeus by approximately 75%, but, unlike the analysis done using
BinClone, it does not explain the methodology and steps taken to reach this outcome.


5.0  CONCLUSION 

Characterizing  the  tools  used  by  attackers  in  cyber  incidents  requires  dissecting  them  through  advanced  analysis 
techniques  to  understand  how  they  work,  which  in  turn  necessitates  reverse  engineering.  This  paper  presents  two 
such  techniques,  together  with  the  prototypes  implementing  them,  to  accelerate  the  reverse  engineering  process. 
Their  objective  is  to  partially  automate  some  of  its  aspects,  by  leveraging  the  existing  sources  of  information 
available,  namely  (i)  public  open  source  code  repositories  and  (ii)  previously  analyzed  assembly  code  fragments. 
The  first  approach  aims  at  saving  time  by  providing  the  significance  of  a  function’s  strings,  constants,  and 
imported  functions,  without  having  the  reverse  engineer  analyze  the  underlying  assembly  code.  The  second 
attempts  to  reduce  redundant  analysis  efforts  by  detecting  code  clones  of  a  target  executable.  Using  the  presented 
analysis  techniques  along  with  the  prototypes  implementing  them,  the  Citadel  malware  was  analyzed  and 
compared  with  its  predecessor  Zeus.  Their  similarities  were  quantified  and  the  results  indicate  that  the  approach 
is  promising  and  is  applicable  to  other  malware  analysis  scenarios. 
1.  Introduction

While  measuring  security  is  an  unsolved  and  important  area,  measuring  system  behaviors  in  terms 
of  performance  and  capability  is  a  well-established  science.  We  argue  that  measuring  security — 
and  hence  understanding  environmental  threats — relies  on  the  projection  of  system  measurements 
(detection  signals)  onto  mission  needs  and  adversarial  objectives.  Put  succinctly,  the  best  security 
metric  identifies  how  well  the  observed  system  can  achieve  its  mission  objectives.  The  best  attack 
metric  identifies  how  well  the  adversary  is  achieving  its  adversarial  goals. 

Historically, defensive cyber-systems have focused on identifying attacks based on observable
system behaviors; this is the basis for modern anomaly and intrusion detection systems. Such
measurements attempt to identify adversarial behavior based on models of normal or aberrant
behavior (e.g., signatures). The goal is to identify what attack is occurring and specifically not what
impact that attack has on the system or environmental goals. However, simply identifying the attack
type often does not provide a clear view of what the goals of the adversary are, how the attack
impacts ongoing mission objectives, or how its effects can (or should) be mitigated.

This paper introduces a vision for security that attempts to infer attack intention and the impacts of
an attack on the missions in progress, rather than diagnosing the identity of the attack itself.
We see this effort as breaking down into two interrelated phases of analysis. The
first phase, discussed in Section 1.1, posits how detection signals can be used to identify resource or
performance related impacts that affect an active cyber-mission. The second phase, discussed in
Section 1.2, attempts to project those state changes on a mission plan described by an operational
model. We conclude by exploring a range of challenges introduced by this research agenda.

The effort highlighted throughout is being carried out within the Cyber-Security Collaborative
Research Alliance (CSec CRA, or just CRA) [CRA15]. The CRA is a consortium of academic, military,
and industrial researchers investigating techniques for ensuring mission progress in the
presence of adversarial action. The goal of the CRA program is to understand and model the risks,
human behaviors and motivations, and attacks within military cyber-maneuvers. The overarching
scientific goal of this effort is to develop a rigorous science of cyber-decision making that enables
military environments to a) detect the risks and attacks present in an environment, b) understand
and predict the motivations and actions of users, defenders, and attackers, and c) alter the environment
to securely achieve maximal maneuver success rates at the lowest resource cost.

2.  Overview 

Figure 1 describes a preliminary analysis framework. At a high level, we map attacks onto the
adversarial goals and impacts on a system. This requires us to manually or automatically identify
how an attack manifests on the victim, as well as the local impacts on its resources. Once identified,
the impacts are mapped onto the mission objectives and plans to determine when a mission
outcome may be in jeopardy. This analysis is used in the context of a mission plan to determine
when an attack is impacting a mission, to identify where the impacts of an attack will present
problems (now and possibly later), and to enable alteration of mission strategies to increase the
likelihood of a positive mission outcome.


Figure  1  -  Intent  and  Impact  Analysis 


1.1.  Attack  Identification  and  Intent  Analysis 

The  first  stage  in  this  approach  is  the  identification  of  system  measurements  that  can  indicate  the 
presence  of  an  attack.  This  is  the  widely  studied  detection  problem,  and  we  defer  to  the  vast 
literature  and  systems  for  solutions  that  address  them.  Here,  it  is  sufficient  to  assume  the 
identification  of  attacks.  Note  that  system  performance  measurements  may  also  be  used  to  identify 
system  state. 

The second stage is to relate those known attacks to intents. Here, we define an
intent of an attack as a set of one or more impacts (e.g., availability, integrity, confidentiality, or
performance) on resources (targets). Note that an attack can have multiple intents. Initially, we will
hand-label intents based on the documented behaviors of the known attacks as well as our
experimental observations. In the longer term, we seek to infer intents based on system
measurements. Such inference can be difficult because causality in complex systems is inherently
vague and often unknowable from simple measurements.

This investigation of intent is similar to the investigation of attack strategies. For example, attack
trees are a means of creating structured models enumerating the ways that attacks can be used in
concert to achieve a particular adversarial goal [Sch99]. Other methods of modeling attackers use
attack patterns [HM04, GW05], which were developed from fault analysis techniques in aviation and
nuclear power systems.

One interesting question that arises from this effort is what exactly the scope and
semantics of resources and impacts are. One approach is to develop ontologies [Gru95, Gua98,
OCWD14] for resources and impacts. Such ontologies provide a way of articulating these features
at different levels of abstraction and granularity. To see why this is necessary, consider two kinds
of network-based denial of service attacks. Attack A floods the network interface of a victim
machine with large packets, while attack B sends many TCP syn-requests that consume entries in
the operating system connection table. The intents of these attacks are similar (reduce the network
performance), but they have vastly different vectors and consequences (consuming bandwidth vs.
preventing successful incoming connections).
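The role of granularity can be made concrete with a small sketch. It assumes, purely for illustration, that a resource is identified by a path in a hypothetical ontology (for example `("network", "bandwidth")`) and that an intent is a set of (impact, resource) pairs, as defined above. The class names, the resource paths, and the labeling of both attacks as performance impacts are assumptions made for this example, not part of the CRA work.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, Tuple


class Impact(Enum):
    AVAILABILITY = "availability"
    INTEGRITY = "integrity"
    CONFIDENTIALITY = "confidentiality"
    PERFORMANCE = "performance"


@dataclass(frozen=True)
class ResourceImpact:
    """One impact on one resource; the resource is a path in a hypothetical ontology."""
    impact: Impact
    resource: Tuple[str, ...]

    def generalize(self, depth: int) -> "ResourceImpact":
        # Keep only the first `depth` levels of the resource path.
        return ResourceImpact(self.impact, self.resource[:depth])


# An intent is a set of one or more impacts on resources.
Intent = FrozenSet[ResourceImpact]


def generalize_intent(intent: Intent, depth: int) -> Intent:
    """View an intent at a coarser level of the resource ontology."""
    return frozenset(item.generalize(depth) for item in intent)


# Hand-labeled intents for the two example attacks (labels are illustrative).
INTENT_A: Intent = frozenset(
    {ResourceImpact(Impact.PERFORMANCE, ("network", "bandwidth"))}
)
INTENT_B: Intent = frozenset(
    {ResourceImpact(Impact.PERFORMANCE, ("network", "connection_table"))}
)

same_at_coarse_level = generalize_intent(INTENT_A, 1) == generalize_intent(INTENT_B, 1)
same_at_fine_level = INTENT_A == INTENT_B
print(same_at_coarse_level, same_at_fine_level)
```

At the coarse level both attacks reduce network performance; at the fine level they target different resources, which is the distinction the mission impact analysis depends on.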
1.2.  Mission  Impact  Analysis

One of the areas of concentration within the CRA is the development of a formalism for describing
mission plans and strategies, called an operational model. To simplify, an operational model is an
annotated finite state machine that describes transitions (maneuvers) that can be undertaken to
move a cyber-mission from an initial state (start) to an end state. Figure 2 shows a partial example
of a mission as represented in a substantially simplified operational model. This example mission
implements a generic request/response exchange relating to the acquisition of data through a
series of discrete steps. Each state in the model is annotated with a set of preconditions that
represent requirements for the state to be reached. Importantly, the preconditions are formulated as
expressions over the resource states that are affected by attacks. This allows us to track the
system state changes over time and, importantly, how an attack impacts the mission.


[Transition labels shown in the figure: Send Request, Transmission, Finish Transmit]


Figure 2 - Simplified Operational Model Example


Focusing on the example, the states "waiting transmission" and "transmit complete" have practical
preconditions for their operation. The "waiting transmission" state can only be reached if the source
from which the data is acquired is reachable and is receiving requests. Further, the "transmit
complete" state can only be reached if connectivity is maintained and there is sufficient bandwidth
to support the entire transmission.

Attack  intents  allow  us  to  reason  about  progress  of  an  environment  executing  a  mission  using  the 
operational  model.  Once  detected,  we  can  formally  reason  about  the  effects  of  the  impacts  on  the 
preconditions  of  the  operational  model  states  by  evaluating  the  precondition  expressions  over  the 
resource  states.  That  is,  the  impacts  restrict  the  set  of  reachable  states  by  making  the  preconditions 
unsatisfiable.  To  see  why,  consider  again  the  execution  of  attacks  A  (packet  flood)  and  B  (syn 
floods)  in  executing  the  sample  data  acquisition  mission.  Attack  A  does  not  prevent  the  system 
from  entering  into  a  wait  state  because  it  restricts  the  bandwidth  but  allows  the  request  to  connect 
(with  some  probability).  However,  such  an  attack  would  prevent  the  process  from  reaching  the 
desired  end  state  (transmit  complete)  because  there  is  not  sufficient  bandwidth  to  complete  the 
transfer.  Conversely,  attack  B  would  prevent  the  wait  state  from  ever  being  reached  and  therefore 
the  mission  would  fail. 
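The reasoning above can be sketched as a reachability check over an annotated state machine. This is a minimal illustration under simplifying assumptions, not the CRA operational model itself: resource states are reduced to booleans, the probabilistic aspect of attack A is ignored, and the state names, resource names, and attack effects are hypothetical labels chosen to mirror the example.

```python
from typing import Callable, Dict, List, Set

Resources = Dict[str, bool]
Precondition = Callable[[Resources], bool]

# Transitions (maneuvers) of the simplified data acquisition mission.
TRANSITIONS: Dict[str, List[str]] = {
    "start": ["waiting_transmission"],
    "waiting_transmission": ["transmit_complete"],
    "transmit_complete": [],
}

# Preconditions are expressions over the resource states.
PRECONDITIONS: Dict[str, Precondition] = {
    "start": lambda r: True,
    "waiting_transmission": lambda r: r["source_reachable"] and r["accepting_requests"],
    "transmit_complete": lambda r: r["source_reachable"] and r["bandwidth_sufficient"],
}

NOMINAL: Resources = {
    "source_reachable": True,
    "accepting_requests": True,
    "bandwidth_sufficient": True,
}

# Assumed impacts: A (packet flood) exhausts bandwidth, B (syn flood) blocks new connections.
ATTACK_EFFECTS: Dict[str, Resources] = {
    "A": {"bandwidth_sufficient": False},
    "B": {"accepting_requests": False},
}


def under_attack(attack: str) -> Resources:
    """Return the resource states after applying the impacts of one attack."""
    return {**NOMINAL, **ATTACK_EFFECTS[attack]}


def reachable_states(resources: Resources, start: str = "start") -> Set[str]:
    """Collect every state whose precondition holds along some path from `start`."""
    if not PRECONDITIONS[start](resources):
        return set()
    seen = {start}
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for successor in TRANSITIONS[state]:
            if successor not in seen and PRECONDITIONS[successor](resources):
                seen.add(successor)
                frontier.append(successor)
    return seen


for label, resources in [("none", NOMINAL), ("A", under_attack("A")), ("B", under_attack("B"))]:
    print(label, sorted(reachable_states(resources)))
```

Under attack A the wait state is still reachable but the end state is not; under attack B only the start state remains reachable, matching the discussion above.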

There are several advantages to this approach to intent-impact analysis. First, an observer can
determine whether a mission can be completed successfully in the presence of an attack before an
impact is realized. In the case of the above example, a system under attack A would know that the
bandwidth needed later would not be available and would never send a request in the first place.

Second, this analysis allows missions to alter their strategies when it is determined
that a mission end-state is not achievable. In this case, the analysis could identify alternate paths
through the state machine that would arrive at the end state. For example, the operation could
employ countermeasures to mitigate the effects of the attack. In the case of attack B, a new state could be
introduced that enables syn puzzle countermeasures as a precondition to the "protected" wait state.
In this way, the model can codify responses to adversarial action and predict future progress.
3.  Research  Challenges

Reasoning about attack intent and mission impacts introduces a number of intriguing research
issues. These include:

•  How to represent intent, at what level of granularity, and at what scale is an
open issue. While ontology development will help, a clear understanding of these issues can
only come about through experimental and operational experience.

•  Determining causality and intent of an attack is difficult. For example, it is often difficult to
determine the difference between intended system behavior (e.g., excess CPU load based on
local workloads) and adversarial actions.

•  New attacks will exploit new systemic features. It is our expectation that intents will
remain largely the same (once we have evaluated a sufficiently large sample of attacks).
Yet, this hypothesis needs to be confirmed.

The answers to all of these questions will be the substance of the CRA research efforts in the coming
years.
Abstract

The increasing interdependency of the physical power
grid and Information Communication Technologies
(ICT) has presented many new research challenges. The
primary focus within the power grid is to ensure that customers
are continuously supplied with electricity. This
is the mission of an electrical power grid. Securing
mission-critical infrastructure requires assessing the impact
of an event in the ICT domain on the physical power
grid. A mission impact assessment (MIA) serves multiple
purposes simultaneously: event correlation, the
recognition of mission-threatening events, and the
computation of the impact of these events. The methodology
developed in the context of this work analyzes
recurring behavioral communication patterns of the
supporting IT infrastructure and maps them to physical
tasks. This mapping allows the analysis of how cyber
events might impact the ongoing mission on an operational
level.

1  Introduction 

Industrial control systems (ICS) often perform mission-
or safety-critical functions to operate infrastructure for
electricity generation and as such are at the heart of critical
national infrastructure. However, ICS that monitor
and operate critical industrial infrastructure worldwide
are subject to an increasing frequency of cyber attacks.

The reason for this is the continuous evolution of the
ICS environment to include standard operating system
platforms and allow connectivity to corporate LANs,
whereas in the past the ICS environment was insulated
from the outside world by a closed, trusted network. The
result is legacy systems and component devices exposed
to modern external threats with weak or non-existent security
mechanisms in place.

SCADA systems must therefore have tools in place
that allow them to identify which events pose a threat to
the power grid, respond to events, and expedite analysis
in real time. To achieve this, continuous monitoring
of all log data generated by SCADA components is
needed to automatically baseline normal, day-to-day activity
across these components and thereby identify any
and all anomalous activity immediately.

On an operational level, an electrical grid is a network
of power providers and consumers that are connected by
transmission and distribution lines. Hence, the mission
of an electrical power grid is to deliver electricity from
suppliers to consumers. For monitoring purposes, these
components are additionally connected to IT infrastructures. In the
past, power system IT infrastructures used to be isolated,
stand-alone systems. However, they are increasingly integrated
with other IT infrastructures at power utilities,
including public infrastructures, in order to increase business
efficiency and effectiveness and reduce operational
costs. In particular, the development of a trustworthy smart
grid requires a deeper understanding of potential impacts
resulting from successful cyber attacks. Estimating feasible
attack impact requires an evaluation of the grid's
dependency on its cyber infrastructure and its ability to
tolerate potential failures. In the following, we define
physical tasks in the context of power grids as all tasks
that strictly rely on physical power grid components and
their local power applications. In the context of this work,
the understanding of what constitutes a mission is analogous
to that of Barreto [2]. Understanding the significance of
a cyber event for a mission requires mapping physical
tasks to their supporting infrastructure. This allows an
integrated view of cyber and physical behavior.

1.1  Motivation 

Currently, conventional network security approaches focus
on perimeter protection instead of identifying the
most business-critical assets and protecting those. Stuxnet
and Flame have taught us that in order to protect critical
infrastructures against these advanced persistent threats,
perimeter protection simply is not enough. In order to
guarantee the safety of critical infrastructures, we need to
guarantee their security, too. Security assurance in cyber-physical
systems means guarding the pathways into the
physical domain. A pathway into the physical domain
does not have to be remote access only; it might be a
USB flash drive, CD, or laptop that technicians load documents
on and carry onto the plant floor. To underpin this
statement, we quote Sean McGurk, director of the
National Cybersecurity and Communications Integration
Center (NCCIC) at the Department of Homeland Security
[19]:

“In our experience in conducting hundreds of vulnerability
assessments in the private sector, in no case have
we ever found the operations network, the SCADA system
or energy management system separated from the
enterprise network. On average, we see 11 direct connections
between those networks. In some extreme cases,
we have identified up to 250 connections between the
actual producing network and the enterprise network.”

2  Physical  Power  Grid 

The system state of the physical power grid may be written
as

$$x = [V, \theta] \qquad (1)$$

with a vector of voltages $V$ and a vector of voltage angles
$\theta$. The vector of active power loads is denoted as $P^l$
and the vector of reactive power loads as $Q^l$. The vector of
active power generators is denoted as $P^g$ and the vector
of reactive power generators as $Q^g$.

As the components of the physical power grid are only
operational within a particular value window, a separate
vector $inv$ collects these operational constraints. These
constraints are due to generator outputs being partially
controllable, and $inv$ collects these generator controls.
Among these generator controls collected within the vector
$inv$ are a generator's maximum and minimum reactive
power capabilities, $Q^{max}$ and $Q^{min}$. Similarly, we denote
a generator's real power capabilities as $P^{max}$ and
$P^{min}$. Also, the vector $inv$ collects other constraints such
as the maximum line capacity $c_{ij}^{max}$ of a line connecting
bus $b_i$ and $b_j$. In the following, we assume all lines to be
numbered and therefore refer to the maximum capacity of a
line as $c_i^{max}$.

If a constraint within $inv$ is violated, this leads to a
control action $u_i \in \Sigma$ being taken, which modifies the state
of the overall physical power grid. It follows from this
that the power flow can be written as a complex vector

$$f(x, inv, u) = 0 \qquad (2)$$

representing the power injection at each node in the system.
Equation 2 represents the physical power grid and
can be broken into active parts $f_i^P$ and reactive parts $f_i^Q$ for a
particular bus $i$.

To model the physical power grid [18], we rely on
graph theory to perform a limited-information, topology-based
contingency analysis that defines the outgoing
power at a particular bus $i$. The active power injection $P_i$
at bus $i$ is described by

$$f_i^P = -P_i^g + P_i^l + \sum_{j=1}^{N} |V_i||V_j| \left( G_{ij} \cos\theta_{ij} + B_{ij} \sin\theta_{ij} \right) \qquad (3)$$

and the reactive power injection $Q_i$ at bus $i$ is denoted as

$$f_i^Q = -Q_i^g + Q_i^l + \sum_{j=1}^{N} |V_i||V_j| \left( G_{ij} \sin\theta_{ij} - B_{ij} \cos\theta_{ij} \right). \qquad (4)$$

Here, $B_{ij}$ denotes the imaginary part of
the element of the bus admittance matrix that defines the admittance
between buses $i$ and $j$. Likewise, $G_{ij}$ denotes
the real part of that element of the bus admittance matrix.
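As an illustration of how Equations 3 and 4 can be evaluated for one bus, the following sketch computes the active and reactive terms from voltage magnitudes, voltage angles, and the real and imaginary parts of the bus admittance matrix. It assumes that the angle difference is $\theta_{ij} = \theta_i - \theta_j$ (the usual convention, not stated in the text) and uses a hypothetical two-bus system with made-up per-unit values.

```python
import math
from typing import Sequence, Tuple

Matrix = Sequence[Sequence[float]]


def power_balance(
    i: int,
    v_mag: Sequence[float],
    theta: Sequence[float],
    G: Matrix,
    B: Matrix,
    p_gen: Sequence[float],
    p_load: Sequence[float],
    q_gen: Sequence[float],
    q_load: Sequence[float],
) -> Tuple[float, float]:
    """Evaluate the active and reactive balance terms of Equations 3 and 4 at bus i."""
    sum_p = 0.0
    sum_q = 0.0
    for j in range(len(v_mag)):
        theta_ij = theta[i] - theta[j]  # assumed convention for the angle difference
        vv = v_mag[i] * v_mag[j]
        sum_p += vv * (G[i][j] * math.cos(theta_ij) + B[i][j] * math.sin(theta_ij))
        sum_q += vv * (G[i][j] * math.sin(theta_ij) - B[i][j] * math.cos(theta_ij))
    f_p = -p_gen[i] + p_load[i] + sum_p
    f_q = -q_gen[i] + q_load[i] + sum_q
    return f_p, f_q


# Hypothetical two-bus system: one line with series admittance 1 - 5j (per unit).
G_BUS = [[1.0, -1.0], [-1.0, 1.0]]
B_BUS = [[-5.0, 5.0], [5.0, -5.0]]
V_FLAT = [1.0, 1.0]      # flat voltage profile
THETA_FLAT = [0.0, 0.0]  # no angle difference, so no power flows over the line

f_p0, f_q0 = power_balance(
    0, V_FLAT, THETA_FLAT, G_BUS, B_BUS,
    p_gen=[1.0, 0.0], p_load=[0.0, 1.0], q_gen=[0.0, 0.0], q_load=[0.0, 0.0],
)
print(f_p0, f_q0)
```

With a flat voltage profile the network terms cancel, so a non-zero value at bus 0 simply shows that this state does not satisfy Equation 2 for the chosen generation and load.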

Active and reactive power are equally important for
maintaining a continuous power supply. Active power
is the power delivered to the end user, which allows the
user to, for example, heat a home or run a
motor. Reactive power allows the regulation of voltage:
if the voltage on the system is too
low, active power cannot be supplied. Reactive power is
therefore essential to move active power through the transmission
and distribution system to the customer, so the reactive
power $Q_i$ at a bus $i$ is important for the active power $P_i$
at the same bus $i$. Yet, the magnitude of $Q_i$ does not
contribute to the significance of a bus $i$ to the entire power
grid. Hence, we consider only
the active power injection $P_i$ to determine the
significance of a bus $i$.

2.1  Electrical  node  significance 

To assess the electrical significance of bus $i$, we rely on
a node centrality measure designed specifically for power
grids [10]. The measure is based on the active power
injection $P_i$


which is normalized over the total number of lines $N$ in
the network. The node significance addresses the fact
that some buses deal with a larger amount of power,
while other nodes distribute a relatively small amount of
power. Hence, if a failure occurs at a link that originates
at a highly significant bus, a significant amount of
power is exposed to the remainder of the network. Redistributing
the excess power of the failed link over adjacent
components may eventually cause further link overload
failures.
2.2  Contingency  Analysis

For the sake of simplicity, this paper assumes a deterministic
model for the line tripping mechanism. In other
words, the circuit breaker of a line trips at the moment
the flow on the line exceeds its rated limit. In case of
islanding, cascading failures continue in each island, in
which generators or loads are shed, respectively, to attain
a supply-demand balance.

In order to quantify the severity of a system failure, we
rely on contingency analysis. This approach removes
power lines from the topology and assesses whether this
leads to a cascading failure. The bigger the cascading
failure, the more important a particular line is for the
overall mission. The mission of a power grid is to continuously
supply electricity to all customers. By quantifying
the severity of system failure, Equation 6 is the
basis for quantifying the mission impact that cyber and
physical faults may have on the overall system state. The
damage caused by pruning line $j$ is quantified by the following
equation:

[Equation (6), garbled in extraction: a ratio of summed line capacities $c_i$, taken over the $l'$ still operational links and over all $l$ links]

with the total number of links $l$, the number of still operational
links $l'$, and the capacity $c_i$ of a line $i$. Equation 6
quantifies the mission impact that removing line $j$ will
have on the entire power grid.

To assess how critical a bus $k$ is for the mission of
continuously supplying power, we sum the damage
caused by pruning all outgoing lines $\{0, \ldots, N\}$
at bus $k$. This is done with the following equation:

[Equation (7), garbled in extraction: a sum over the outgoing lines $0, \ldots, N$ of bus $k$]

which gives the mission impact that removing bus $k$ from
the power grid topology will have on the overall power
grid.

The contingency analysis proceeds in the following steps:

1. Based on Equation 5, select a highly significant bus $i$,
then choose one of its lines and remove it from
the topology.

2. Update the corresponding element of the bus admittance
matrix $Y_{ij}$.

3. Re-compute the power flow given by Equations
3 and 4.

4. Check the connectedness of the power grid, since in
case of islanding, cascading failures continue separately
in each island.

5. Check the flow limit violations of the transmission
lines. If the flow on a transmission line exceeds
its rated limit, label the corresponding line as
pruned and repeat steps 2, 3, and 4.

6. Compute the damage caused by the cascading effect according
to Equation 6.


2.3  Hybrid  Automaton

To model the physical power grid, we rely on a hybrid
automaton to capture the characteristics that were derived
in the previous subsections. A hybrid automaton allows us
to quantify the mission impact (Eq. 7) that the current system
state (Eq. 6 and Eq. 1) has on the overall system.
This is done via the contingency analysis described in
Subsection 2.2.
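The cascade loop of the contingency analysis in Subsection 2.2 can be sketched as follows. This is a toy illustration under loud assumptions and not the authors' implementation: instead of re-solving the power flow (Equations 3 and 4), the flow of a pruned line is split equally among the in-service lines that share a bus with it, and the damage is measured as the fraction of total line capacity that is no longer in service, which is a stand-in for Equation 6. The `Line` class and the three-line example network are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Line:
    from_bus: int
    to_bus: int
    capacity: float  # rated limit of the line
    flow: float      # pre-contingency flow


def adjacent_lines(lines: List[Line], idx: int, in_service: Set[int]) -> List[int]:
    """In-service lines that share at least one bus with line `idx`."""
    buses = {lines[idx].from_bus, lines[idx].to_bus}
    return [
        k for k in sorted(in_service)
        if k != idx and buses & {lines[k].from_bus, lines[k].to_bus}
    ]


def cascade(lines: List[Line], first: int) -> Set[int]:
    """Prune line `first`, then keep pruning lines whose flow exceeds their limit."""
    flows: Dict[int, float] = {k: line.flow for k, line in enumerate(lines)}
    in_service: Set[int] = set(flows)
    pending = [first]
    pruned: Set[int] = set()
    while pending:
        idx = pending.pop()
        if idx not in in_service:
            continue
        in_service.discard(idx)
        pruned.add(idx)
        # Assumed redistribution rule (replaces the power flow re-computation).
        neighbours = adjacent_lines(lines, idx, in_service)
        for k in neighbours:
            flows[k] += flows[idx] / len(neighbours)
        # Deterministic tripping: a line trips once its flow exceeds its rated limit.
        for k in sorted(in_service):
            if flows[k] > lines[k].capacity and k not in pending:
                pending.append(k)
    return pruned


def capacity_lost(lines: List[Line], pruned: Set[int]) -> float:
    """Assumed damage measure: share of total capacity that is out of service."""
    total = sum(line.capacity for line in lines)
    return sum(lines[k].capacity for k in pruned) / total


# Hypothetical triangle network with buses 1, 2, 3.
NETWORK = [
    Line(1, 2, capacity=1.0, flow=0.75),
    Line(2, 3, capacity=1.0, flow=0.75),
    Line(1, 3, capacity=2.5, flow=0.5),
]

for start in range(len(NETWORK)):
    out = cascade(NETWORK, start)
    print(start, sorted(out), round(capacity_lost(NETWORK, out), 3))
```

Ranking the lines by the resulting damage gives an ordering in the spirit of a mission-criticality ranking of power lines, although the values produced by this toy redistribution rule are not comparable to results obtained from an actual power flow computation.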

Definition 1 A hybrid automaton is defined as a tuple
$\langle Q, X, init, \Sigma, inv, f, T \rangle$ where

$Q = \{q_0, q_1, \ldots, q_n\}$ is the finite set of states of the
automaton,

$X = \{x_0, x_1, \ldots, x_n\}$ is the set of continuous system
state variables in $\mathbb{R}$ that can be seen in Equation 1,

$init = Q_0 \times X_0$ is the set of initial conditions,

$\Sigma = \{u_0, u_1, \ldots, u_n\}$ is a finite, discrete set that represents
discrete changes of control mode in the physical
power grid (e.g., load shedding),

$inv$ represents invariants that must apply for every
particular state $q_i \in Q$,

$f(x, inv, u) = 0$ is the continuous state associated with
each discrete state $q_i \in Q$ as seen in Equation 2,

$T : Q \times X \times \Sigma \rightarrow 2^{Q \times X}$ is the transition map.


[Figure legend: generators, synchronous condensers]


Figure 1: IEEE 14 bus test system

Figure 1 shows the IEEE 14 bus test system, which
was used as a test case in this paper. Table

3  Communication  Network 

An accumulation of all routine processes within a communication
network can be seen in Figure 3. While such a
textual representation of a communication network can
only be provided via human input, the communication
patterns within a network can be derived based on network
traffic.

Figure 2: Typical electric communication network diagram taken from [9]


Table 1: Mission-criticality ranking of critical power
lines in the IEEE 14 bus test system

Rank   From Bus   To Bus
1      7          9
2      6          13
3      8          7
4      6          11
5      5          4
6      11         10
7      1          2
8      2          4
9      14         13
10     3          2
11     6          12
12     4          7
13     4          9
14     2          4
15     2          5
16     12         13
17     3          4
18     5          6
19     9          10
20     9          14

Relying on a network operator to model the processes of a monitored
infrastructure is an error-prone task, as these
processes are subject to frequent change. Also, a network
operator might not have complete knowledge of all
processes in the monitored infrastructure. Acquiring information
based on human input means that trust in the
completeness and accuracy of the provided information
is required. Also, models acquired based on human
input cannot automatically adapt to a changing environment.
Hence, an automatic, machine learning based approach
to obtaining a model of the communication network
is sought in the context of this work.

Let us assume that we have complete knowledge of the
monitored infrastructure by knowing all processes taking
place and being able to record all occurring network traffic
over an extended period of time. A closer analysis of
this network traffic would show recurring communication
patterns. By grouping network traffic according to
the media access control (MAC) addresses of the monitored
infrastructure, recurring communication patterns
can be found. Hence, we come to the conclusion that the
communication patterns of a network can be used to deduce a
communication network.
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Intern  et 


Contr&l  Center 


Figure  3:  Routine  operations  within  an  electrical  communication  network 
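
The grouping of traffic by MAC address described above can be
sketched as follows. This is a minimal illustration, not the
implementation used in this work: the frame representation (source
MAC, destination MAC, message label), the function name
recurring_patterns and the threshold min_count are assumptions
introduced for the example.

```python
from collections import Counter, defaultdict


def recurring_patterns(frames, min_count=2):
    """Group frames by (source MAC, destination MAC) and keep the
    message labels that recur at least `min_count` times per pair.

    `frames` is an iterable of (src_mac, dst_mac, label) tuples; this
    simplified record format is an assumption of the sketch.
    """
    by_pair = defaultdict(list)
    for src, dst, label in frames:
        by_pair[(src, dst)].append(label)

    patterns = {}
    for pair, labels in by_pair.items():
        recurring = {m: c for m, c in Counter(labels).items() if c >= min_count}
        if recurring:
            patterns[pair] = recurring
    return patterns


# Hypothetical capture: a master polling an outstation twice, plus one
# isolated SSH frame that does not recur.
M1, M2, M3 = "00:00:00:00:00:01", "00:00:00:00:00:02", "00:00:00:00:00:03"
frames = [
    (M1, M2, "DNP3 read"),
    (M2, M1, "DNP3 response"),
    (M1, M2, "DNP3 read"),
    (M2, M1, "DNP3 response"),
    (M3, M1, "SSH"),
]
edges = recurring_patterns(frames)
```

Each key of the result is a directed edge of the deduced communication
network; pairs that never repeat a message are dropped.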


3.1  Communication  Protocols 

Within the energy sector there are two protocols that are
widely used: the Distributed Network Protocol 3.0 (DNP3) [7],
which is currently the predominant standard in North American
power systems, and IEC 61850, which was recently standardized
for modern power substation automation by the International
Electrotechnical Commission (IEC). IEC 61850 is based on
standard Ethernet technologies to enable applications with
critical real-time requirements in substation automation
systems. As the power grid is increasingly interconnected,
different types of network protocols (HTTP, SNTP, SSH, Modbus,
ProfiBus, IEC 60870-5-103, DNP3) may occur within a monitored
critical infrastructure. This is why the network model relied
on in the context of this work needs to be able to monitor
different types of protocols.


Figure  4:  Expected  communication  pattern 


In the following we define a deterministic finite automaton
of the communication patterns of protocols utilized by ICT
devices within the power grid. An example of a communication
pattern of the protocol DNP3 is shown in Figure 4.

Definition 2 A deterministic finite automaton (DFA) is
defined as a tuple $\langle Q, \Sigma, \Phi, \Theta, \delta \rangle$ where

$Q = \{q_0, q_1, \ldots, q_n\}$ is the finite set of states of the
automaton, corresponding to the ICT devices in the
communication network

$S = \{s_0, s_1, \ldots, s_m\}$ is a finite set of states for every
$q_i \in Q$, which constitutes the current state of an ICT
device

$\Sigma = \{p_0, p_1, \ldots, p_o\}$ is the finite set of protocols
detected within the communication network

$\Phi = \{\phi_0, \phi_1, \ldots, \phi_p\}$ is a finite set of distinct packet
types (function codes, protocol codes) for every $p_i \in \Sigma$

$\Theta = \{\theta_0, \theta_1, \ldots, \theta_q\}$ is a finite set of events, which
range from events and alerts from intrusion detection/protection
systems to events encoded within a protocol itself

$\delta \subseteq Q \times S \times \Sigma \times \Phi \times \Theta \times Q \times S$ is a transition
relation
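
To make the tuple concrete, the definition can be held in a small
data structure in which each element of the transition relation is one
record. The sketch below is illustrative only: the class and field
names, the placeholder event "none" and the example transition are
assumptions, not part of the model as published.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Transition:
    """One element of delta: (q, s, p, phi, theta, q', s')."""
    device: str
    state: str
    protocol: str
    packet_type: int
    event: str
    next_device: str
    next_state: str


@dataclass
class CommunicationModel:
    devices: set          # Q
    states: set           # S
    protocols: set        # Sigma
    packet_types: dict    # Phi, keyed by protocol
    events: set           # Theta
    delta: set = field(default_factory=set)

    def add(self, t: Transition) -> None:
        """Add a transition after checking every component is declared."""
        if not {t.device, t.next_device} <= self.devices:
            raise ValueError("unknown device")
        if not {t.state, t.next_state} <= self.states:
            raise ValueError("unknown state")
        if t.protocol not in self.protocols:
            raise ValueError("unknown protocol")
        if t.packet_type not in self.packet_types.get(t.protocol, set()):
            raise ValueError("unknown packet type")
        if t.event not in self.events:
            raise ValueError("unknown event")
        self.delta.add(t)

    def successors(self, device, state, protocol, packet_type, event):
        """Return all (device, state) pairs reachable in one step."""
        key = (device, state, protocol, packet_type, event)
        return {
            (t.next_device, t.next_state)
            for t in self.delta
            if (t.device, t.state, t.protocol, t.packet_type, t.event) == key
        }


# Hypothetical two-device network; "none" stands for "no event observed".
model = CommunicationModel(
    devices={"master", "outstation"},
    states={"Idle", "Request", "Response", "Failure"},
    protocols={"DNP3"},
    packet_types={"DNP3": {0x01, 0x81}},
    events={"none"},
)
model.add(Transition("master", "Idle", "DNP3", 0x01, "none",
                     "outstation", "Request"))
```

Storing the relation as a set of records keeps the sketch close to the
definition; a production model would likely index transitions by
their left-hand side for faster lookup.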

These communication protocol patterns, represented by a DFA
over all protocols utilized by ICT devices, tend to become
quite large; hence, in the following we are only able to
demonstrate the derivation in an exemplary fashion. To
substantiate this claim, consider the set of protocols
$\Sigma = \{$HTTP, SNTP, SSH, Modbus, ProfiBus, IEC 60870-5-103, DNP3$\}$
that may be used within the communication network of a power
grid. To exemplify the communication model, we show the
derivation of the DFA based on the application layer of the
DNP3 protocol. The DNP3 application layer request and response
format is shown in Table 2. Based on the application layer of
DNP3 we extract the packet types $\Phi$ for the protocol.

Table 2: Application request and response format

Header           Fields
Request header   Application Control | Function Code
Response header  Application Control | Function Code | Internal Indication (IIN)

Table 3: Packet types $\Phi$ for the protocol DNP3 $\in \Sigma$

Function code  Meaning
Requests
0x00           Confirm
0x01           Read
0x02           Write
0x12           Stop application
0x15           Disable unsolicited
Responses
0x81           Response
0x82           Unsolicited response

Table 4: Set of states $S = \{s_0, s_1, \ldots, s_3\}$ for an ICT device
$q_i \in Q$, which communicates via the protocol DNP3 $\in \Sigma$

State  Name      Description
s_0    Request   When a message with a function code fc is sent, the
                 device enters the Request state.
s_1    Response  Once the addressed device replies to the request with
                 the same function code fc and returns a value, the
                 device enters the Response state while processing the
                 information.
s_2    Idle      After processing the information, the device enters
                 an Idle state.
s_3    Failure   If an event or transition that is not allowed occurs,
                 the connection enters the Failure state.

Table 3 shows an extract of the packet types for the
protocol DNP3 $\in \Sigma$. DNP3 relies on function codes to
specify the purpose of a request and response message.
The function codes cover reads, writes, start application
(0x11), stop application (0x12), and administrative and
diagnostic purposes. Many function codes can have significant
security impacts, such as false writes (0x02), stop
application (0x12), and disable unsolicited (0x15). Internal
indications are two bytes that communicate useful information
about an outstation unit to the master. Each bit has a
specific meaning and is updated in every reply message. This
information is part of the application header of a DNP3
packet.
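
The header layout and the function codes named above can be read
from a raw application-layer fragment roughly as follows. This is a
sketch under stated assumptions: the fragment is assumed to start at
the application control byte (lower-layer framing already removed),
only the function codes listed in this section are known to the
parser, and the names AppHeader, parse_app_header and
SECURITY_RELEVANT are hypothetical.

```python
from typing import NamedTuple, Optional

# Function codes taken from the text and Table 3 (an excerpt only).
FUNCTION_CODES = {
    0x00: "Confirm",
    0x01: "Read",
    0x02: "Write",
    0x11: "Start application",
    0x12: "Stop application",
    0x15: "Disable unsolicited",
    0x81: "Response",
    0x82: "Unsolicited response",
}
RESPONSE_CODES = {0x81, 0x82}
# Codes the text singles out as having significant security impact.
SECURITY_RELEVANT = {0x02, 0x12, 0x15}


class AppHeader(NamedTuple):
    control: int             # application control byte
    function_code: int
    iin: Optional[bytes]     # two IIN bytes, present in responses only


def parse_app_header(fragment: bytes) -> AppHeader:
    """Parse the application header at the start of `fragment`."""
    if len(fragment) < 2:
        raise ValueError("fragment too short for a request header")
    control, function_code = fragment[0], fragment[1]
    if function_code in RESPONSE_CODES:
        if len(fragment) < 4:
            raise ValueError("fragment too short for a response header")
        return AppHeader(control, function_code, bytes(fragment[2:4]))
    return AppHeader(control, function_code, None)


# Hypothetical fragments: a write request and a response carrying IIN bytes.
write_request = parse_app_header(bytes([0xC0, 0x02]))
response = parse_app_header(bytes([0xC0, 0x81, 0x10, 0x00]))
```

A monitor built on this would flag any request whose function code is
in SECURITY_RELEVANT for closer inspection.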

Based on the packet types $\Phi$, the set of states $S$ is
defined for an ICT device $q_i \in Q$ that communicates via
DNP3 $\in \Sigma$. The set of states $S$ is textually described
in Table 4. Based on the set of states $S$ and the transition
relation $\delta$, the generalized state transition system is
shown in Figure 5.
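
The textual state descriptions of Table 4 can be turned into a small
state machine. The sketch below follows those descriptions only; the
choice of Idle as the initial state, the action names ("send",
"reply", "processed") and the treatment of Failure as absorbing are
assumptions, since the transitions of Figure 5 are not reproduced
here.

```python
class DeviceStateMachine:
    """State machine over the states Request, Response, Idle, Failure."""

    def __init__(self):
        self.state = "Idle"       # assumed initial state
        self.pending_fc = None    # function code of the open request

    def step(self, action, fc=None):
        """Apply one observed action and return the new state."""
        if self.state == "Idle" and action == "send":
            # A message with function code fc is sent.
            self.state, self.pending_fc = "Request", fc
        elif (self.state == "Request" and action == "reply"
              and fc == self.pending_fc):
            # The addressed device replies with the same function code.
            self.state = "Response"
        elif self.state == "Response" and action == "processed":
            # Processing has finished.
            self.state, self.pending_fc = "Idle", None
        else:
            # Any event or transition that is not allowed.
            self.state = "Failure"
        return self.state


# A regular request/response cycle followed by an unexpected reply.
machine = DeviceStateMachine()
trace = [
    machine.step("send", 0x01),
    machine.step("reply", 0x01),
    machine.step("processed"),
    machine.step("reply", 0x01),
]
```

The last step shows the point of the Failure state: a reply without a
preceding request is not part of the expected communication pattern.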

An excerpt of the events $\Theta$ that are encoded within
DNP3 itself is shown in Table 5. Internal indications (IIN)
LSB and MSB are only included in responses from remote
stations (see Figure 3). Events $\Theta$ also include unknown
events from intrusion detection and protection systems within
the monitored critical infrastructure. These events do not
need to be known previously; however, it is required that
they can be assigned to one or more monitored ICT devices.
All events that have been assigned to one or more ICT devices
are reported internally in order to be further analyzed for
their impact on the overall system.

Table 5: Excerpt of events $\Theta = \{\theta_0, \theta_1, \ldots, \theta_q\}$ encoded
within the application header of the protocol DNP3 $\in \Sigma$

Internal Indications
LSB
IIN1.0   All stations
IIN1.1   Class 1 events
IIN1.2   Class 2 events
IIN1.3   Class 3 events
IIN1.4   Need time
MSB
IIN2.0   Function code not supported
IIN2.5   Configuration corrupted

Figure 5: State transition system for the states textually
described in Table 4
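
Decoding the internal indication bits of Table 5 into named events
could look like the following sketch. It covers only the bits listed
in the table; the assumption that the LSB octet (IIN1) precedes the
MSB octet (IIN2) in the two-byte field, as well as the function name
decode_iin, belong to the example rather than to the source.

```python
# Bits from Table 5, keyed by (octet, bit): octet 1 = LSB, octet 2 = MSB.
IIN_EVENTS = {
    (1, 0): "All stations",
    (1, 1): "Class 1 events",
    (1, 2): "Class 2 events",
    (1, 3): "Class 3 events",
    (1, 4): "Need time",
    (2, 0): "Function code not supported",
    (2, 5): "Configuration corrupted",
}


def decode_iin(iin: bytes) -> list:
    """Return the names of all Table 5 indications set in `iin`."""
    if len(iin) != 2:
        raise ValueError("IIN field must be exactly two bytes")
    octets = {1: iin[0], 2: iin[1]}  # assumed order: IIN1 first
    return [
        name
        for (octet, bit), name in sorted(IIN_EVENTS.items())
        if (octets[octet] >> bit) & 1
    ]


# IIN1 has bits 1 and 4 set, IIN2 has bit 5 set.
events = decode_iin(bytes([0b00010010, 0b00100000]))
```

Each decoded name is one event that can then be assigned to the ICT
device that sent the response.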

3.2  Network  Structure 

The main purpose of the mission impact model developed in
the context of this work is to quantify the impact of cyber
events on the overall mission of the power grid. Hence, we
have to look into the close link between information and
communications technology (ICT) devices and the physical
power grid. A power grid is a system of systems, where ICT
and the physical power grid are tied together via control
loop feedback mechanisms. An exemplary control loop feedback
mechanism is shown in Figure 6. This mechanism constitutes
the basic behavioral operating unit of a system of systems
[13].


[Figure labels: Control Center (Computation and Analysis);
Remote/Local Control; Data Acquisition;
Actuator -> Control -> Machine/Device -> Measurement -> Sensor]

Figure 6: Control loop within a power grid


3.3  Security  Metrics 

Figure 6 schematically shows a control loop within the
power grid. The purpose of a control loop is to conceptualize
monitoring and controlling the dynamic behavior of a system.
Data acquisition relies on sensors to observe the state of
the power grid. For remotely controlling the state of the
power grid, this sensor information is analyzed within the
control center. Based on this analysis, control commands are
sent to actuators. These actuators control equipment of the
power grid. Based on modern control system theory, the
control loop is widely used within industrial control systems
and introduces the concepts of controllability and
observability of a dynamical system, which become relevant
when actuator or sensor signals are under attack. This is why
we rely on the concepts of observability and controllability
to distinguish different categories of network traffic.

The concepts of observability and controllability can
easily be explained based on Figure 6. Sensor measurements
are needed to observe the state of the power grid.
Observability refers to the necessity of the power grid being
observable to the operators controlling it. This means that
the veracity and timeliness of sensor measurements acquired
during data acquisition are essential to detect any
unforeseen or anomalous situation. After analyzing the
acquired data, remote and local control commands are
transmitted to actuators within the physical power grid.
Ensuring controllability means ensuring that control commands
are transmitted correctly and on time. Controllability
describes the need of the monitored infrastructure to be able
to react appropriately, at all times, to the various
situations that may arise.
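
One way to separate network traffic into the two categories is by the
roles of the communicating devices in the control loop. The sketch
below is an illustration under assumptions: the role names, the
Message record and the timeliness bound max_delay_s are hypothetical
and are not taken from the model described in this work.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    src_role: str        # "sensor", "actuator" or "control_center"
    dst_role: str
    sent_at: float       # seconds
    received_at: float   # seconds


def category(msg: Message) -> str:
    """Classify a message by the part of the control loop it serves."""
    if msg.src_role == "sensor" and msg.dst_role == "control_center":
        return "observability"      # sensor measurement (data acquisition)
    if msg.src_role == "control_center" and msg.dst_role == "actuator":
        return "controllability"    # control command
    return "other"


def on_time(msg: Message, max_delay_s: float) -> bool:
    """Check the timeliness requirement against an assumed delay bound."""
    return (msg.received_at - msg.sent_at) <= max_delay_s


measurement = Message("sensor", "control_center", 0.0, 0.25)
command = Message("control_center", "actuator", 1.0, 3.0)
```

A late or missing measurement then counts against observability, and
a late or missing command against controllability.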

3.4  Cyber  Attack 

A cyber attack on the physical power grid can be classified
into two different categories: (1) manipulation, interception
or replay of sensor measurements, and (2) manipulation,
interception or replay of control commands. A cyber attack
has the goal of having an impact on the physical power grid.
Manipulation of sensor measurements affects the observability
of the physical power grid. Sensor measurements report the
current state of components of the physical power grid. Given
these components, the observability impact of manipulated
sensor measurements is quantified based on Equations 6 and 7.
This indicates to an operator the amount of unobservable
load. The same holds true for manipulated and dropped control
commands. We assume that these commands can be linked to
components of the physical power grid. On this basis we
quantify the mission impact that these events have based on
Equations 6 and 7. This indicates to an operator the amount
of uncontrollable load.

4  Conclusion 

Summarizing, the mission of an electrical power grid is to
ensure customers are continuously supplied with electricity.
The mission impact model developed in the context of this
work allows the assessment of the impact of an event within
the ICT domain on the physical power grid. We have presented
an approach that quantifies the mission impact of events on
the overall power grid. Additionally, we developed an
automated approach that does not rely on an operator’s input
to provide a mission model.

4.1  Related  Research 

Related research can be divided into research on Mission
Impact Assessment (MIA) and critical infrastructure analysis.

Mission  Impact  Model 

The  concept  of  MIAs  was  developed  in  military  research 
and  is  sometimes  also  referred  to  as  mission-centricity 
in  cyber  security.  [6],  [15],  [8]  and  [17]  all  propose  dis¬ 
tinct  mission-centric  approaches  to  cyber  security.  [11] 
proposed  a  framework  for  cyber  attack  modeling  and  im¬ 
pact  assessment  in  order  to  allow  risk  analysis  by  gen¬ 
erating  attack  graphs  and  calculating  security  metrics. 
Another  approach  was  proposed  by  [20],  who  concep¬ 
tualized  mission-centric  cyber-security  as  a  convex  opti¬ 
mization  problem. 

Critical  Infrastructure  Analysis 

Before considering protective measures for critical
infrastructures, it is necessary to understand the
functioning of an infrastructure and identify critical
processes. This is why infrastructure analysis is crucial in
the context of this work. [14] analyzed dependency-aware
integration of Cyber-Physical Systems in smart homes. [4]
researched how risk and system theory apply to critical
infrastructure vulnerabilities and how to quantify them,
applied to water systems.

[12]  proposed  CANDID,  which  is  a  framework  for  the 
classification  of  assets  in  networks  by  determining  their 
importance  and  dependencies.  Prior  work  includes  [5], 
who  proposed  a  methodology  for  modeling  complex  in¬ 
frastructures,  [16],  [3]  and  [1],  who  all  analyzed  and 
modeled  interdependence  in  critical  network  infrastruc¬ 
tures.  The  preceding  European  Union  research  project 
IRRIIS  ([?]),  which  is  an  acronym  for  Integrated  Risk 
Reduction  of  Information-based  Infrastructure  Systems, 
also  looked  into  reducing  risk  in  interdependent  critical 
network  infrastructures. 
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Abstract — Internet  of  Things  (IoT)  and  Cyber-Physical  Systems 
(CPS) are both relatively novel networking paradigms integrating
the cyber and physical worlds of strongly networked devices. In the
IoT and CPS realms, the devices interact with the physical world
through their sensors and actuators. Indeed, the utilization of
sensors and actuators, important components of these realms,
has been around for a long time in industry and military
settings. For instance, sensors are utilized in numerous military
applications due to their low cost and multiple functionalities.
For different military units, sensors are key components of
any modern warfare. Unmanned aerial vehicles (UAVs) navigate
via sensor balls. Acoustic, magnetic, and pressure sensors are
utilized in detecting and avoiding underwater mines. Nonetheless,
current security models consider protecting only networking
components of the CPS and IoT devices utilizing traditional
security mechanisms (e.g., an intrusion detection system for
the data in the network stack). These protection mechanisms
are not sufficient to protect CPS and IoT devices from threats
directly emanating from sensory channels. Using sensory channels
(e.g., light, temperature, infrared, acoustic), an adversary can
successfully attack military CPS and IoT. In this short paper,
we discuss the sensory channel threats to military CPS and IoT
devices.

Index  Terms — Military  Communications  and  Information  Sys¬ 
tems,  Sensory-channel  threats  to  Military  assets,  Cyber-Physical 
Systems, Internet-of-Things

I.  Introduction 

Cyber space is expanding fast with the introduction of
new Cyber-Physical Systems (CPS) and Internet of Things
(IoT) devices. Today, it is extremely challenging to find a
CPS or IoT device without any networking capability. The number
of smart watches, thermostats, glasses, fitness trackers, medical
devices, Internet-connected house appliances, and vehicles has
grown exponentially in a short period of time. It is estimated
that, on average, one device is connected to the Internet every
eighty seconds, and that our everyday lives will be dominated by
billions of smart connected devices by the end of this decade [1].

In a similar fashion, the U.S. Department of Defense
(DoD)’s Global Information Grid (GIG) [2] (Figure 1) includes
myriads of robustly networked intelligent IoT and CPS devices
and wearables, such as heads-up display (HUD) glasses [3]
(Figure 2(a)), bio-engineered systems, intelligent sensors, and
autonomous systems. These devices are utilized in many
military applications and support a hybrid force of manned
and unmanned combat systems in their critical decisions in both
peace and war conditions. For instance, UAVs (Figure 2(b))
navigate via sensor balls, and armored suits used by the military
also depend on a number of different environment-monitoring
sensors (e.g., optical, acoustic, seismic, and temperature) [4].
Acoustic, magnetic, and pressure sensors are utilized in
detecting and avoiding underwater mines. Naval weapon systems
(e.g., Aegis Combat System) on destroyers work with remote
sensors to intercept targets and defend beyond line of sight [2]
in Anti-Air Warfare (AAW). Given the increasingly critical
nature of the cyberspace of these IoT and CPS devices, it is
imperative that they are secured. An adversary only needs one
entry point to infiltrate the GIG.

Fig. 1. Illustration of the U.S. Department of Defense Global Information
Grid, source: [2].

Nonetheless, we note that it is also possible to exploit
sensor-based military CPS and IoT assets (applications and
devices) directly via their sensory components [7], [8]. For
instance, a malicious temperature input to an automated
sprinkler system’s temperature sensor on board a navy vessel
(e.g., cruisers, destroyers, submarines) can cause serious
damage to the safety of operations, tasks, and personnel.
Similarly, a light sensor normally activated by a certain
illuminance value can easily be tricked by false input from a
powerful flashlight of an enemy unit. In fact, to the best of our
knowledge, military CPS and IoT security is currently limited to
protecting the CPS and IoT components networked via traditional
means (e.g., RF) or services on the host devices. In other words,
securing a networked military CPS and IoT asset means utilizing
the same tools and security mechanisms developed for the RF world.
However, sensory components in CPS and IoT devices form
sensory channels that serve as external interfaces to their host
systems. Since a significant number of critical functionalities
(Figure 2) in the CPS and IoT realms are realized by interacting
with the real world through these sensory channels, securing
the sensory channels is as vital as securing other components
of military CPS and IoT assets. Hence, in this paper, we focus
on the sensory channel threats to military CPS and IoT assets.

Fig. 2. (a) A smart glass by Vuzix, source: [3]; (b) Predator drone, source: [5]; (c) Dragonfly-Micro-UAV, source: [6].

II.  Sensory  Channel  Threats 

In  this  section,  we  describe  specific  ways  of  exploiting 
sensory  channels  to  perpetrate  malicious  activities  against  the 
military  CPS  and  IoT  assets. 

We  primarily  envision  four  different  ways  to  perpetrate 
malicious  activities  on  CPS/IoT  sensory  channels.  Using  these 
channels,  an  adversary  can  (1)  trigger  existing  malware,  (2) 
transfer  malware,  (3)  combine  multiple  channels  to  increase 
the  impact  of  a  threat,  or  (4)  leak  sensitive  information. 

In the first threat, the adversary triggers a malicious program
existing in the host CPS or IoT device or application where
the sensory channel resides. The malicious program is assumed
to be loaded into the system’s hardware or software without
the knowledge of its owner [9]. The malicious program is
activated by a specific value or sensory pattern received over
the sensory channels. For instance, a malicious program can
be triggered over an accelerometer to capture videos or pictures
surreptitiously.

The  second  threat  involves  utilizing  sensory  channels  to 
deliver  a  certain  piece  of  malware  to  a  CPS  or  IoT  device 
or  application.  The  device  could  be  already  compromised 
or  not.  A  complete  malicious  code  segment  or  Trojan  can 
be  transmitted  by  the  attacker  through  the  sensory  channels. 
As  the  traditional  communication  channel  (e.g.,  RF)  remains 
unaffected  by  this  threat,  it  becomes  more  difficult  to  detect 
or  prevent  this  threat.  New  Trojans  can  also  be  transferred 
or  updated  remotely  to  the  compromised  CPS  or  IoT  device 
without  being  detected. 


In the third threat, an adversary can effectively combine
more than one sensory channel. Today most CPS and IoT devices
are manufactured with more than one sensor. For instance, a
military armored suit utilizes a number of different
environment-monitoring sensors (e.g., optical, acoustic,
seismic, temperature). Hence, a plausible but more complicated
scenario we envision in the third threat is the combination of
more than one sensory channel to increase the impact of a single
channel. In this case, an adversary can combine the sensory
channels to increase the effective rate that can be achieved
while delivering malware. Furthermore, an adversary can bundle
the traditional communication channel with the sensory channels
to increase the impact of the damage.

Finally, in the fourth threat, an adversary may passively
observe sensitive information leaked, intentionally or
unintentionally, through the sensory channels.

III.  Conclusion 

In  this  short  position  paper,  we  focused  on  threats  to  military 
CPS  and  IoT  assets  through  their  sensory  channels.  The  sen¬ 
sors  on  host  CPS  and  IoT  devices  and  applications  effectively 
form  the  sensory  channels.  We  specifically  articulated  how  a 
malicious  entity  can  target  military  CPS  and  IoT  with  four 
different  methods  exploiting  the  sensory  channels. 
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Abstract — Since  all  Department  of  Defense  (DoD)  missions 
depend  on  cyber  assets  and  capabilities,  a  dynamic  and  ac¬ 
curate  cyber  dependency  analysis  is  a  critical  component  of 
mission  assurance.  Mission  analysis  aims  to  identify  hosts  and 
applications  that  are  “mission  critical”  so  they  can  be  moni¬ 
tored,  and  resources  preferentially  allocated  to  mitigate  risks. 
For  missions  limited  in  duration  and  scale  (tactical  missions), 
dependency  analysis  is  possible  to  conceptualize  in  principle, 
although  currently  difficult  to  realize  in  practice.  However,  for 
missions  of  long  duration  and  large  scale  (strategic  missions),  the 
situation  is  murkier.  In  particular,  cyber  researchers  struggle 
to  find  technologies  that  will  scale  up  to  large  numbers  of 
hosts  and  applications,  since  a  typical  strategic  DoD  mission 
might  expect  to  leverage  a  large  enterprise  network.  In  this 
position  paper,  we  argue  that  the  difficulty  is  fundamental: 
as  the  mission  timescale  becomes  longer  and  longer,  and  the 
number  of  hosts  associated  with  the  mission  becomes  larger  and 
larger,  the  mission  encompasses  the  entire  network,  and  mission 
defense  becomes  indistinguishable  from  classic  network  defense. 
Concepts  generally  associated  with  mission  assurance,  such  as 
fight-through,  are  not  well  suited  to  these  long  timescales  and 
large  networks.  This  train  of  thought  leads  us  to  reconsider 
the  concept  of  “scalability”  as  it  applies  to  mission  assurance, 
and  suggest  that  a  hierarchical  abstraction  approach  be  applied. 
Large-scale,  long  duration  mission  assurance  may  be  treated 
as  the  interaction  of  many  small-scale,  short  duration  tactical 
missions. 

I.  Introduction 

The  Department  of  Defense  (DoD)  recognizes  that  all 
defense  missions  today  depend  on  cyber  infrastructure.  The 
2010  Quadrennial  Defense  Review  finds  that  [7]  “A  failure 
by  the  Department  to  secure  its  systems  in  cyberspace  would 
pose  a  fundamental  risk  to  our  ability  to  accomplish  defense 
missions  today  and  in  the  future.”  The  role  of  cyber  depen¬ 
dencies  in  providing  mission  assurance  has  inspired  multiple 
studies  and  technology  development  efforts  [2],  [4],  [5],  [6], 
[8],  [9].  The  DoD  must  be  able  to  guarantee  that  it  can  continue 
accomplishing  critical  missions,  even  in  the  face  of  degraded 
or  disabled  cyber  infrastructure.  Identifying  the  Cyber  Key 
Terrain  (C-KT),  i.e.  those  cyber  assets  necessarily  used  in 
mission  execution,  is  a  vital  ingredient  needed  to  provide  such 
guaranteed  mission  assurance.  1 

There is a divide in the literature regarding the best strategy
for identifying the C-KT of a particular mission, and mapping
out its network dependencies. Methodologies tend to fall in one
of two basic classes: process-driven mapping and artifact-driven
mapping. Process-driven mapping makes heavy use of subject
matter experts, and is typically manual and time consuming.
Artifact-driven mapping leverages usage data and lends itself
more readily to automation, but the data frequently lacks
sufficient context to reliably identify the C-KT. The proponents
of both methodologies are concerned with the ability to scale
up to enterprise-scale networks. Of particular concern is that
dependency maps of large numbers of hosts, over very long
timescales, tend to be difficult to convey succinctly. They often
produce a deluge of data which suffers from the “hairball”
problem when visually represented.

1 The DoD definition of key terrain in general is “Any locality, or area, the
seizure or retention of which affords a marked advantage to either combatant.”
The cyber version of this would also include assets that enable the adversary
to execute its mission against the U.S. For the purposes of this paper, however,
we have adopted a more restrictive definition focused on mission assurance.

DoD  missions  can  exist  at  the  strategic,  operational  or 
tactical  levels.  In  general,  strategic  and  operational  missions 
are  conducted  over  longer  timescales,  and  are  much  broader 
in  scope  than  tactical  missions.  In  this  position  paper  we 
explore  the  hypothesis  that  strategic  and  operational  missions 
are  dependent  on,  and  to  a  great  degree  comprised  of,  sub¬ 
missions  conducted  at  the  tactical  level.  Thus  effective  mission 
assurance  at  every  layer  of  the  hierarchy  depends  on  the  ability 
to  map  tactical  level  missions  with  fidelity.  This  is  particularly 
pertinent  to  the  cyber  domain.  It  may  be  misguided  to  focus 
entirely  on  techniques  and  visualizations  that  scale  up  to  enter¬ 
prise  network  scale,  or  are  capable  of  processing  data  volumes 
from  extended  periods  of  time.  Indeed  such  techniques  may 
over-aggregate  and  not  provide  sufficient  situational  awareness 
to  identify  and  mitigate  risks  to  the  mission. 

Although this discussion takes place in the context of the
DoD, all of the conclusions can be generalized to apply to the
civilian arena. All large organizations include missions that
can be described as tactical, operational and strategic.

II.  Timescale 

Strategic,  operational  and  tactical  missions  are  conducted 
over  distinct  characteristic  times.  Strategic  missions  capture  the 
essential  role  of  the  organization;  e.g.  the  mission  of  DoD  is  to 
provide  the  military  forces  needed  to  deter  war  and  to  protect 
the  security  of  our  country  [1].  Because  of  this,  strategic 
missions  are  executed  continuously  rather  than  on  a  short 
timescale,  and  the  mission  definition  evolves  very  slowly  if 
at  all.  In  contrast,  tactical  missions  tend  to  comprise  a  specific 
set  of  military  actions  with  a  well  defined  goal  that  is  easily 
measured;  e.g.  conduct  an  airstrike  against  a  particular  target. 
The  duration  of  tactical  missions  is  generally  short,  although 
the  mission  can  be  repeated  multiple  times.  Tactical  missions 
are  defined  and  executed  based  on  specific  military  actions 
that  need  to  be  taken,  so  the  mission  definitions  are  variable 
and  are  often  not  known  in  advance.  Finally,  operational 
missions  involve  resource  allocation  and  the  integration  of 
tactical missions to achieve strategic ends [3]. The timescales
for operational missions are generally long, but the mission
definition may evolve more swiftly than that of a strategic
mission.

[Figure axis label: time necessary to assure mission success]

Fig. 1. Tactical mission assurance involves relatively few
hosts and short timescales; enterprise defense involves many hosts and
long to indefinite timescales.


According  to  USAF  doctrine,  while  the  resulting  effects 
may  be  described  as  operational  or  strategic,  military  actions 
occur  almost  entirely  at  the  tactical  level  [3].  This  is  partic¬ 
ularly  true  in  cyberspace.  While  cyber  assets  are  frequently 
used  to  provide  information  and  command  and  control  in 
support  of  strategic  and  operational  missions,  the  delivery  of 
information,  key  applications,  services,  and  command  channels 
occurs  entirely  at  the  tactical  level.  This  fact  generates  an 
argument  for  shifting  the  focus  of  dependency  mapping  efforts 
to  providing  mission  assurance  at  the  tactical  level. 

Another  reason  to  make  this  shift  is  that  certain  central 
elements  of  mission  assurance  are  easier  to  define  and  measure 
at  the  tactical  level  than  at  the  operational  or  strategic  levels. 
The  ability  to  “fight  through”  a  contested  cyberspace  is  a 
concept  that  only  applies  for  missions  of  finite  duration;  one 
cannot  fight  through  to  infinity. 

III.  Number  of  Hosts 

Strategic  and  operational  missions  use  a  larger  fraction 
of  the  total  hosts  on  the  network  than  tactical  missions. 
Indeed,  an  enterprise  network  exists  to  serve  the  strategic 
missions  of  the  organization.  In  contrast,  tactical  missions 
are generally supported by a small fraction of the total network.
Good  network  hygiene  dictates  that  if  a  host  is  not  supporting 
any  organizational  mission,  it  needlessly  presents  extra  attack 
surface  to  adversaries  and  should  be  removed.  But  network 
hygiene  is  distinct  from  mission  dependency  mapping;  the 
central  aim  of  mission  dependency  mapping  is  to  identify  a 
restricted set of hosts (as a fraction of the total network) critical
to a particular effort. If the number of hosts necessary to prosecute
a mission approaches the size of the network, mission
defense is indistinguishable from classic network defense. In
practice, the number of hosts and timescale (discussed above) are
correlated, as depicted schematically in Figure 1.

Another central element of effective mission assurance at
the operational and strategic levels is the use of well-defined
Courses of Action (COAs) designed to help decision makers
react  to  evolving  priorities  and  risks.  Mapping  the  COA 
dependencies  independently  is  critical.  In  a  contested  cyber 
environment  one  cannot  defend  every  asset.  Limited  resources 
need  to  be  allocated  to  defend  highest  priority  cyber  terrain, 
based  on  tactical  decisions  regarding  which  COA  is  being 
pursued  in  support  of  the  operational  or  strategic  mission. 

IV.  Discussion 

The  import  of  the  arguments  presented  here  is  that  mission 
assurance  software  need  not  “scale”  to  the  size  of  a  global 
enterprise,  as  the  term  scaling  is  usually  defined.  Visualizations 
and  algorithms  need  not  work  for  thousands  of  hosts.  If 
thousands  of  hosts  are  present  in  the  dependency  map  of  an 
operational  or  strategic  mission,  with  little  or  no  fidelity  in  the 
mapping  of  the  tactical  importance  of  these  hosts,  it  will  be 
difficult  for  mission  defenders  to  know  which  hosts  should 
be  monitored.  Such  a  dependency  map  will  not  help  them 
correctly prioritize the allocation of resources; rather, it will
be an illegible hairball and will be ignored.

Enterprise  scale  mission  assurance  is  instead  achieved 
by  hierarchical  decomposition  into  tactical  missions,  each 
associated  with  a  particular  COA.  It  is  important  to  explore 
the  validity  of  modeling  strategic  or  operational  missions  as 
entirely  composed  of  missions  at  the  tactical  level,  with  the 
overarching  mission  being  decomposed  into  sub-missions,  and 
sub-missions  decomposed  into  sub-sub-missions,  and  so  forth. 
At  each  mission  level,  as  much  detail  as  possible  of  the  level 
below  is  abstracted  away,  leaving  only  those  details  which 
are  necessary  to  maintain  fidelity  of  mission  interactions.  In 
this  manner,  the  problems  associated  with  scaling  and  data 
deluge  are  minimized.  However,  effective  models  of  mission 
assurance for operational and strategic missions will necessarily
involve retaining enough fidelity to capture the complex
interactions  possible  between  multiple  tactical  building  blocks 
[10].  Determining  the  minimum  necessary  level  of  fidelity  is 
an  important  area  for  future  investigation. 
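
The decomposition described above can be illustrated with a small data structure. The following is a minimal sketch, not a description of any fielded tool: the `Mission` class, its fields, and all mission and host names are hypothetical. Each mission exposes a full-fidelity host set and an abstracted summary of its direct sub-missions, and a helper flags hosts shared by two tactical missions, which is one simple form of the interactions a higher-level model would need to retain.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Mission:
    """A node in a hypothetical mission hierarchy (illustrative only)."""
    name: str
    hosts: Set[str] = field(default_factory=set)  # hosts used directly
    submissions: List["Mission"] = field(default_factory=list)

    def all_hosts(self) -> Set[str]:
        """Full-fidelity view: every host used at or below this mission."""
        found = set(self.hosts)
        for sub in self.submissions:
            found |= sub.all_hosts()
        return found

    def summary(self) -> Dict[str, int]:
        """Abstracted view: host counts per direct sub-mission only."""
        return {sub.name: len(sub.all_hosts()) for sub in self.submissions}


def shared_hosts(a: Mission, b: Mission) -> Set[str]:
    """Hosts on which two missions both depend (a possible interaction)."""
    return a.all_hosts() & b.all_hosts()


# Hypothetical example data.
resupply = Mission("resupply", hosts={"logistics-db", "gps-relay"})
tasking = Mission("air-tasking", hosts={"ato-server", "gps-relay", "weather-feed"})
campaign = Mission("campaign", submissions=[resupply, tasking])

print(campaign.summary())
print(shared_hosts(resupply, tasking))
```

In this toy hierarchy the campaign-level view keeps only per-mission host counts, while the shared-host check keeps the one detail (a common dependency) that couples the two tactical missions.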

V.  Conclusions  and  Recommendations 

In  summary,  we  are  asserting  two  major  propositions.  First, 
in  the  cyber  domain,  crucial  mission  assurance  constructs  such 
as  cyber  key  terrain  and  fight-through  are  meaningful  for 
tactical missions involving limited time and a small fraction
of total network resources, but cease to be useful for enduring
missions  which  utilize  a  large  fraction  of  cyber  resources. 
In  the  latter  case,  mission  assurance  simply  degenerates  into 
classic  network  defense  and  network  hygiene.  Second,  that 
large,  enduring,  strategic  missions  may  in  fact  be  decomposed 
hierarchically  into  many  small  tactical  ones,  and  that  by  so 
doing  problems  of  scaling,  data  deluge,  and  visualization 
(the  hairball  problem)  are  minimized  by  dropping  complexity 
between  layers  of  the  hierarchy.  The  outstanding  problem 
becomes  determining  the  minimum  fidelity  necessary  in  the 
dependency  mapping  of  tactical  missions  and  sub-missions 
to  maintain  accurate  models  of  complex  system  interactions 
between  the  tactical  building  blocks.  Our  recommendations 
are  to  focus  near  term  efforts  on  developing  technology  for 
the  swift  and  accurate  mapping  of  tactical  missions,  with  a 
longer  term  focus  on  modeling  their  complex  interactions  to 
assure  larger  scale  missions. 
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ABSTRACT 

We  apply  a  contest-game  theoretic  framework  for  modelling  the  economic  impact  of  a  cyber-attack.  In  the  model 
the  attacker/defender  allocate  available  resources  and  efforts  to  maximize  gain/minimize  loss  from  the  attack. 
Among its useful features, the parsimonious model allows for the assessment of the asymmetry in the
effectiveness  of  the  resource  use,  different  scale  for  gains  and  losses,  and  the  non-zero  probability  of  the 
unknown  vulnerability  to  be  exploited  in  the  attack.  A  Nash  solution  in  pure  strategies  is  demonstrated  and 
analysed. 


1.0  INTRODUCTION 

Governmental  security  systems  face  cyber-threats  from  many  sources,  including  solo  hackers,  criminal 
organizations, and intelligence services of other nations. In this research, we focus on government-to-government
rivalries. A model is developed that shows the economically efficient level of resources a defending
nation  should  commit  to  its  cyber  security  system  given  the  expected  economic  loss  associated  with  a  successful 
attack,  and  the  counter  actions  of  a  rival.  This  research  is  effectively  a  form  of  benefit  cost  analysis  conducted 
within  the  context  of  a  strategic  rivalry  between  countries. 

Although  game  theory  methods  have  been  used  in  the  general  field  of  terrorism  study  for  the  past  two  decades 
(See  Sandler  and  Sequeira,  2009),  the  application  to  cyber  security  policymaking  has  been  relatively  recent  (See 
Roy  et  al  2010  and  Lazka  et  al  2014  for  reviews).  Our  contribution  is  to  show  the  way  economically  efficient 
defensive  investments  should  reflect  the  gains  and  losses  of  a  successful  attack,  and  the  parameter  values  of  the 
functional  form  relating  resource  commitments  in  defence  and  attack  to  outcome  probabilities. 

We  begin  in  the  next  section  with  a  discussion  of  the  modelling  motivation  and  formulation.  The  following 
section  solves  the  model,  and  the  next  two  sections  detail  the  economically  rational  level  of  resource 
commitments  of  both  the  attacking  and  the  defending  nation.  The  following  section  discusses  the  possibility  of 
decision-making  noise  that  affects  the  equilibrium  solutions  for  the  model.  The  final  section  offers  some 
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concluding  remarks  and  suggestions  for  additional  research. 


2.0  THE  MODEL 

We  assume  a  rivalry  between  the  intelligence  services  of  two  countries  in  which  a  defender  country  faces  a 
cyber-attack  from  the  rival.  The  rivals  are  relatively  symmetrically  positioned  in  terms  of  level  of  technology  and 
available  resources,  and  the  probability  of  a  successful  attack  is  reasonably  high.  This  context  contrasts  with  the 
asymmetric threats with low probability outcomes, such as those posed by terrorist organizations attempting to
launch an attack on the territory of a well-defended country.

In  the  benchmark  model  developed  in  this  paper,  it  is  assumed  that  both  intelligence  services  are  rational  -  as 
rationality  is  defined  in  conventional  economic  models  -  and  that  the  rivals  are  informed  about  each  other’s 
actions.  The  rationality  assumption  is  reconsidered  below.  In  this  version  of  the  paper,  we  also  make  the 
simplifying  assumption  that  the  game  is  not  repeated. 

Specifically, we assume a one-stage, simultaneous-move game in which one country (“the attacker”) attacks
the  cyber-infrastructure  of  the  other  country  (“the  defender”).  The  probability  that  the  attack  succeeds  reflects 
the  resources  committed  by  both  the  attacker  and  the  defender.  The  goal  of  the  attacker  is  to  maximize  their 
expected  gains  from  attack,  while  the  objective  of  a  defender  is  to  minimize  their  expected  losses.  The  solution 
concept  is  Nash  equilibrium.  Following  Krutilla  and  Alexeev  (2012)  the  model  is  represented  as  follows: 

$$\max_{R_A} P_A = G_A\, p(R_A, R_D) - R_A, \qquad (1)$$

$$\min_{R_D} P_D = L_D\, p(R_A, R_D) + R_D, \qquad (2)$$

The variables $R_A$ and $R_D$ are the resource commitments of the attacker and the defender, respectively, and $P_A$
and $P_D$ are the expected net payoffs of the two rivals. The monetized value of the attacker’s gain $G_A$ and
defender’s loss $L_D$ are both exogenous variables. $G_A$ can be thought of as the monetized utility value that the
attacker derives from the damage they create; $L_D$ is the monetized utility value of the damage to the defender.
Without significant loss of generality, we express the relationship between $G_A$ and $L_D$ as $G_A = \gamma L_D$, with
$0 \le \gamma$. The $\gamma$ parameter can be seen as an attacker’s unit valuation of a dollar of damage they create.


The term $p(R_A, R_D)$ in equations (1)-(2) denotes an attack success function that gives the probability
of the attack’s success as a function of the attacker’s and defender’s resource commitments. The functional
form used for $p(R_A, R_D)$ is based on a modified contest success function commonly used to model
rent-seeking contests in the political economy literature (see Tullock 1980 and Glachant 2005):


$$p(R_A, R_D) = p_0 + \frac{R_A^{r}}{R_A^{r} + \alpha R_D^{r}} \qquad (3)$$


The term $p_0$, $0 < p_0 \le 1$, represents the exogenous probability of a successful attack. A non-zero
initial probability may exist due to exogenous technical change in the form of new information about a
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system’s  vulnerability  that  an  attacker  can  exploit  with  virtually  no  investment.  The  necessary  domain 
restriction is: $p_0 \le p(R_A, R_D) \le 1$ (see footnote 1). Turning to the other parameters, $\alpha \in (0, \infty)$ represents asymmetries in the
relative effectiveness of rivals’ resource commitments. Because $\alpha$ multiplies the defender’s term in (3), the range $\alpha > 1$
implies relatively greater effectiveness of the defender’s resource commitments, while $\alpha < 1$ implies relatively
greater effectiveness for the attacker.
The $r$ parameter (see footnote 2), with $r \in (0, \infty)$, represents the returns to attacking and defending resource commitments.
Increasing its value gives relatively more weight in the outcome probability to whichever rival devotes more
resources to the contest.
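
The behaviour of the success function can be explored numerically. The sketch below is an illustration rather than the authors' implementation: the function name, default values, and the treatment of zero attacker effort are assumptions made here. It evaluates the form $p_0 + R_A^r / (R_A^r + \alpha R_D^r)$ and caps the result at 1, in line with the domain restriction.

```python
def attack_success_probability(r_a, r_d, p0=0.0, alpha=1.0, r=1.0):
    """Success probability p0 + R_A^r / (R_A^r + alpha * R_D^r), capped at 1.

    Assumption for this sketch: with no attacker effort (r_a == 0) only the
    exogenous probability p0 remains.
    """
    if r_a <= 0.0:
        return min(1.0, p0)
    contest = 1.0 / (1.0 + alpha * (r_d / r_a) ** r)
    return min(1.0, p0 + contest)


# Equal effort and symmetric effectiveness split the contest evenly.
even = attack_success_probability(1.0, 1.0)
# A larger alpha weights the defender's effort more heavily.
defended = attack_success_probability(1.0, 1.0, alpha=2.0)
# A larger r rewards whichever side commits more (here the attacker).
sharper = attack_success_probability(2.0, 1.0, r=3.0)
print(even, defended, sharper)
```

Writing the contest term as $1 / (1 + \alpha (R_D / R_A)^r)$ avoids a division by zero when the defender commits nothing.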


3.0  RESULTS 


Substituting (3) into (1) and (2) and solving for $(R_A^*, R_D^*)$ gives candidates for Nash equilibria in pure
strategies. The solutions turn out to be:


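$$r_A^* = \frac{r\,\alpha\,\gamma^{r+1}}{(\gamma^{r} + \alpha)^2} \qquad (4)$$

$$r_D^* = \frac{r\,\alpha\,\gamma^{r}}{(\gamma^{r} + \alpha)^2} \qquad (5)$$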


The new left-hand side variables are $r_A^* \equiv R_A^* / L_D$ and $r_D^* \equiv R_D^* / L_D$. That is, the left-hand side variables give
the ratio of each rival’s resource commitment to the defender’s damage loss. The right-hand side depends
only on the exogenous parameters of the model. Figure 1 below illustrates the optimal resource commitments
as a function of the model’s parameters.

Note  that  if  (5)  is  divided  by  (4),  the  result  is  the  simple  expression: 


$$\frac{R_D^*}{R_A^*} = \frac{1}{\gamma} \qquad (6)$$


This implies that the ratio of efforts and resources devoted by the defender to that by the attacker is inversely
proportional to $\gamma$ (again, the unit valuation by the attacker of a dollar of damage to the defender), whatever
the absolute gain $G_A$ or loss $L_D$ associated with a successful attack. Given the assumption that $0 \le \gamma$, the
defender’s resource commitments ($R_D^*$) will always be greater in equilibrium than the attacker’s ($R_A^*$) when
the attacker underestimates its gain ($0 \le \gamma < 1$), and vice versa when the damage value to the attacker is
overestimated ($1 \le \gamma$).
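
The equilibrium candidates can be checked numerically. The following is a minimal sketch under stated assumptions: it uses the closed forms that the first-order conditions of (1) and (2) yield under (3) for an interior solution (success probability below 1), works in units of $L_D$, and drops the constant $p_0$ term because it does not affect the optimum. Function names are illustrative, and second-order conditions are not checked.

```python
def equilibrium(gamma, alpha, r):
    """Candidate equilibrium commitments (R_A*/L_D, R_D*/L_D)."""
    denom = (gamma ** r + alpha) ** 2
    r_a = r * alpha * gamma ** (r + 1) / denom
    r_d = r * alpha * gamma ** r / denom
    return r_a, r_d


def contest(r_a, r_d, alpha, r):
    """Endogenous part of the success probability in (3)."""
    return r_a ** r / (r_a ** r + alpha * r_d ** r)


def attacker_payoff(r_a, r_d, gamma, alpha, r):
    # Objective (1) divided by L_D; the constant p0 term is omitted.
    return gamma * contest(r_a, r_d, alpha, r) - r_a


def defender_cost(r_a, r_d, alpha, r):
    # Objective (2) divided by L_D; the constant p0 term is omitted.
    return contest(r_a, r_d, alpha, r) + r_d


def slope(f, x, h=1e-6):
    """Central finite-difference derivative of f at x."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


gamma, alpha, r = 2.0, 1.0, 1.0
r_a, r_d = equilibrium(gamma, alpha, r)
print(r_a, r_d, r_d / r_a)
# Both slopes should be close to zero at a stationary point.
print(slope(lambda x: attacker_payoff(x, r_d, gamma, alpha, r), r_a))
print(slope(lambda x: defender_cost(r_a, x, alpha, r), r_d))
```

Each slope is taken with the rival's commitment held fixed, which is the unilateral-deviation test that a Nash equilibrium must pass.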


1 Formally, (3) must be written as $p(R_A, R_D) = \min\left\{1,\; p_0 + \left[1 + \alpha (R_D / R_A)^{r}\right]^{-1}\right\}$.

2 See Baye et al. 1994, Nitzan 1994, Perez-Castrillo and Verdier 1992.
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Figure  1.  Optimal  Resource  Commitments,  as  a  Function  of  the  Model’s  Parameters  for  the  Defender 
(upper  cells)  and  for  the  Attacker  (bottom  cells) 

Substituting (6) into (3) gives the reduced-form probability for the success of the attack:

$$p(R_A^*, R_D^*) = p_0 + \frac{\gamma^{r}}{\gamma^{r} + \alpha} \qquad (7)$$


The probability of the attack’s success, $p(R_A^*, R_D^*)$, increases in $\gamma$ and declines in $\alpha$; again, the latter is
the relative effectiveness of resource use by the defender. For fixed $\alpha$, the probability increases with $r$
when $\gamma > 1$: the higher the return to attack, the higher the probability of attack success.
Figure 2 below shows the probability of a successful attack as a function of the model’s parameters.
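
As a short worked check of these comparative statics (a derivation from (7) added here for illustration, valid on the interior where the probability is below 1 and for $\gamma > 0$), differentiating the endogenous term $\gamma^{r} / (\gamma^{r} + \alpha)$ gives:

$$\frac{\partial p}{\partial \gamma} = \frac{r\,\alpha\,\gamma^{r-1}}{(\gamma^{r} + \alpha)^2} > 0, \qquad \frac{\partial p}{\partial \alpha} = -\frac{\gamma^{r}}{(\gamma^{r} + \alpha)^2} < 0, \qquad \frac{\partial p}{\partial r} = \frac{\alpha\,\gamma^{r} \ln \gamma}{(\gamma^{r} + \alpha)^2}.$$

The first two signs hold for all admissible parameter values. The sign of the third follows that of $\ln \gamma$, so the probability rises with $r$ when $\gamma > 1$, falls when $\gamma < 1$, and is unchanged at $\gamma = 1$.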


Figure  2.  Probability  of  Successful  Attack  Curves,  as  a  Function  of  the  Model’s  Parameters 


4.0  RATIONALITY  ASSUMPTION  AND  DECISION  MAKING  NOISE 

Although government intelligence services should presumably act analytically, there are a number of
rationales for relaxing the ubiquitous economic assumption that they would choose the best strategy in an
optimizing game-theoretic framework. Both laboratory experiments and empirical observation often reveal
deviations from strictly rational behaviour, including on cyber-security issues (see, e.g., Bada et al.
(2015) and Yang et al. (2015), among others). There are several modelling frameworks for bounded rationality.
All of these approaches assume that the agent chooses not the best strategy but a “reasonably good” strategy that
deviates from optimality with a probability that declines with the magnitude of the deviation. For example, Wall
(1993) develops and implements dynamic and adaptive models that combine satisficing behaviour with
learning and adaptation through environmental feedback. This is sequential decision making that considers one
alternative strategy at a time, with search strategies based on learning and adaptation. Quantal Response
Equilibrium (QRE) is another approach (see, e.g., Sheremeta (2015), An et al. (2013)). Another framework has
been developed by Amegashie (2006). His contest success function, originally discovered by Dasgupta and Nti
(1998), adds a “noise parameter” to the decision-making process. In cyber-security applications, it might
measure false-positive alarms in the intrusion detection system. In this context, the contest success function
(shown here for the attacker) has the following form:


$$p(R_A, R_D) = \frac{R_A^{r} + \lambda}{R_A^{r} + \alpha R_D^{r} + 2\lambda} \qquad (8)$$


where $\lambda$ is the noise parameter, and $\alpha$, $R_A$, $R_D$ are defined as in (3). When $\lambda = 0$, the decisions are defined by
the Nash equilibria as calculated in (4)-(5). As $\lambda$ increases, the solution departs farther from the Nash
equilibria and becomes less sensitive to the value of the resources invested in attack/defense. As $\lambda \to \infty$,
the choice of strategy is absolutely random, that is, $p(R_A, R_D) = 0.5$ regardless of what actions the defending
country and attacking country take. The solutions of the contest game (1), (2) and (8) include the parameter $\lambda$
and can be expressed as follows:


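$$r_A^* = \frac{\gamma(\gamma - 2\lambda)\alpha - \alpha^2\lambda - \gamma^2\lambda}{(\alpha + \gamma)^2}, \qquad r_D^* = \frac{(\gamma - \lambda)\alpha^2 - 2\alpha\gamma\lambda - \gamma^2\lambda}{\alpha(\alpha + \gamma)^2} \qquad (9)$$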


Figure 3 below illustrates the optimal resource commitments as a function of the model’s parameters and the noise
parameter $\lambda$.


Figure  3.  Optimal  resource  commitments,  as  a  Function  of  the  Noise  Parameter  for  the  defender  (left) 
and  for  the  attacker  (right) 
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Note  that  ratio  r*  /  rA  defined  in  (6)  takes  the  following  form: 

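$$\frac{R_D^*}{R_A^*} = \frac{r_D^*}{r_A^*} = \frac{(\gamma - \lambda)\alpha^2 - 2\alpha\gamma\lambda - \gamma^2\lambda}{\alpha\left[\gamma(\gamma - 2\lambda)\alpha - \alpha^2\lambda - \gamma^2\lambda\right]} \qquad (10)$$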


At $\lambda = 0$ the decisions (9) are defined by the Nash equilibria as calculated in (4)-(5), and the ratio $r_D^* / r_A^* \to \gamma^{-1}$,
as defined in (6). At the other extreme, as $\lambda \to \infty$, the ratio (10) of the resource costs tends to $\alpha^{-1}$. Recall that

parameter $\alpha$ represents asymmetries in the relative effectiveness of rivals’ resource commitments; because
$\alpha$ multiplies the defender’s term in (3), $\alpha > 1$ implies relatively greater effectiveness of the defender’s
resource commitments, and vice versa for $\alpha < 1$. In a perfectly noisy environment,
$r_D^* / r_A^* \to \alpha^{-1}$ implies that the ratio of efforts and resources
devoted by the defender to that by the attacker is inversely proportional to $\alpha$ and independent of $\gamma$. In other
words, the ratio does not depend on either the absolute values of the gain, $G_A$, or loss, $L_D$, associated with a
successful attack, or on their ratio. Note that in (9) and (10) the noise parameter $\lambda$ is dimensionless and scaled in
units of $L_D$; $r = 1$ is assumed for simplicity. It can routinely be shown that both $\partial r_A^* / \partial \lambda < 0$ and
$\partial r_D^* / \partial \lambda < 0$. This
coincides with the conventional wisdom. As a strategy departs from optimality due to bounded rationality or
increasing noise in the system, both the probability of the attack’s success and the payoffs become less and less
sensitive to the resource cost allocated for the attack/defense, and, consequently, fewer resources are required for
the optimal strategy.
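
The effect of the noise parameter on the commitments can be tabulated directly. The sketch below evaluates the expressions in (9) as read here, with $r = 1$ and $\lambda$ in units of $L_D$; the function name and parameter values are illustrative assumptions. The values are only meaningful while both commitments remain positive, and the sketch makes no attempt to handle corner solutions for larger $\lambda$.

```python
def noisy_equilibrium(gamma, alpha, lam):
    """Commitments (r_A*, r_D*) from (9); r = 1, lam in units of L_D."""
    denom = (alpha + gamma) ** 2
    r_a = (gamma * (gamma - 2.0 * lam) * alpha
           - alpha ** 2 * lam - gamma ** 2 * lam) / denom
    r_d = ((gamma - lam) * alpha ** 2
           - 2.0 * alpha * gamma * lam - gamma ** 2 * lam) / (alpha * denom)
    return r_a, r_d


gamma, alpha = 2.0, 1.5  # illustrative parameter values
for lam in (0.0, 0.05, 0.10):
    r_a, r_d = noisy_equilibrium(gamma, alpha, lam)
    print(lam, r_a, r_d, r_d / r_a)
```

Expanding the numerators shows that each commitment is linear in $\lambda$, with slope $-1$ for the attacker and $-1/\alpha$ for the defender, which matches the negative derivatives noted above.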


5.0  CONCLUSION  AND  FUTURE  RESEARCH 

We have used a contest-game theoretic framework to model a strategic contest between the
intelligence services of two countries, where one country attempts to penetrate the cyber system of the
other, and the country on the receiving end of the cyber-attack attempts to defeat the attack. The attacker and
defender allocate resources to maximize their gains and minimize their losses, respectively, taking into
account the actions of each other. The model used to describe this interaction represents the effects of
asymmetry in the effectiveness of the resource commitments, different scales for gains and losses, and a non-zero
probability that an unknown vulnerability will be exploited in the attack. A Nash solution in pure strategies
is used to evaluate the dependence of the initial probability of a successful attack on the measure of the
attack’s detrimental effect.

Conceptually, the behaviour of intelligence services should be assessed in a generalized contest model that
incorporates a measure of the risk-averseness of the attacker/defender, bounded rationality, decomposition of the
risk attitude into systematic and idiosyncratic components, and judgmental bias, among others. Additionally,
the  actors  in  the  contest  will  be  interacting  over  multiple  periods.  We  are  beginning  to  incorporate  these 
features  into  the  model  in  our  ongoing  research  program. 
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Modern  warfare  is  increasingly  dependent  upon  resources,  connections, 
and interactions in the domain that Gibson famously dubbed cyberspace.1 It
follows,  then,  that  the  deepening  reliance  on  electronic  data,  networks,  and 
computing  resources  presents  a  valuable  target  to  adversaries.  Therefore, 
combatant  commanders  should  seek  to  protect  their  Communications  and 
Information  Systems  (CIS)  from  corruption  and  denial  by  saboteurs  in  much 
the  same  way  that  protecting  lines  of  supply  and  communication  was  at  the 
forefront  of  every  general’s  mind  in  the  massive  land  campaigns  of  the  last 
century.2 

Conflict is the result of competing, imperfectly informed decision makers
applying resources against targets. This is equally true of adversarial
interactions in cyberspace, where decision makers can be human or artificial
agents; targets include data and platforms used by CIS; resources include
exploits, credentials, and data manipulation scripts; and partial observability
is the result of imperfect and limited aperture sensors. To better prepare
for, detect, and respond to attacks in cyberspace, we must seek to understand
not only what an adversary might do in this space but also how an

*  corresponding  author:  matthew.henry@jhuapl.edu,  (240)  228-2585 

1W. Gibson, Burning Chrome, Omni. July, 1982, pp. 72-77, 102-104.

2Thanks to my colleague Chuck Crossett for articulating this useful analogy.
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attack  might  proceed  in  the  context  of  different  opportunities  and  obstacles 
presented  to  the  attacker. 

Opportunities and obstacles can be passive and static, in which case they
present an adversary with fixed terrain, or they can be responsive and dynamic,
in which case they present an adversary with a much less predictable
landscape. Greater insight into the alternatives available to an adversary
and the associated outcomes of different adversary choices under different
circumstances will help the CIS Defender to extrapolate comprehensive attack
awareness from sparse detection events so that countermeasures can be
more effectively deployed. Moreover, a priori understanding of how attacks
in the aggregate might proceed in the context of different types of opportunities
and obstacles will help CIS system engineers make better-informed
decisions about architectural design choices, selection and placement of sensors,
implementation of intrusion detection and prevention safeguards, and
institution of operational practices and training programs.

The key to achieving these needed insights is good modeling and model-based
analysis. Our purpose here is to briefly illustrate the relative benefits
and shortcomings of one of the most prevalent and useful modeling
paradigms in current practice, graph-based analysis, and then contrast it
with a new approach that explicitly accounts for adversary decision processes
and the effects of partially observed attack state spaces when modeling
conflicts in cyberspace.

Graph-based modeling techniques compute measures of cyber attack
state reachability, where the attack state is typically described by the set of
resources accessible by the attacker at any stage of the attack.3 The advantages
of this approach include a relatively manageable data requirement for
constructing the model, repeatably and precisely computable outcomes, and
easily interpretable results. Moreover, because these techniques focus largely
on graph traversal, where the nodes typically represent system resources and
access requirements, finding high-value passive security enhancement opportunities
such as firewall rules, access control policies, and so forth, can be
straightforward.4
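
Attack state reachability of this kind can be sketched in a few lines. The following is an illustration only, not the method of any cited tool: the rule format (a set of required resources plus the resource gained) and all resource names are hypothetical. It computes the set of resources an attacker could eventually hold, then repeats the computation after removing one rule to mimic a passive control such as a firewall rule.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# (resources the attacker must already hold, resource gained)
Rule = Tuple[FrozenSet[str], str]


def reachable_resources(initial: Iterable[str], rules: Iterable[Rule]) -> Set[str]:
    """Fixed-point computation of every resource the attacker can reach."""
    owned = set(initial)
    rule_list: List[Rule] = list(rules)
    changed = True
    while changed:
        changed = False
        for required, gained in rule_list:
            if gained not in owned and required <= owned:
                owned.add(gained)
                changed = True
    return owned


# Hypothetical network: names are placeholders, not real systems.
rules = [
    (frozenset({"internet"}), "web-server"),
    (frozenset({"web-server"}), "app-server"),
    (frozenset({"web-server"}), "db-credentials"),
    (frozenset({"app-server", "db-credentials"}), "database"),
]
before = reachable_resources({"internet"}, rules)

# Mimic a passive control that blocks the step onto the app server.
hardened = [rule for rule in rules if rule[1] != "app-server"]
after = reachable_resources({"internet"}, hardened)

print(sorted(before))
print(sorted(after))
```

Comparing the two reachable sets shows which resources a single passive control removes from the attacker's reach, which is the kind of result that makes graph-based analysis easy to interpret.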

In graph-based analysis, consequence measures for each reachable attack
state are used to assess risk. These measures can be estimated using a variety
of techniques, including consequence state reachability. Consequence
state is described by the set of outcomes that can be induced by the attacker

3cf.  K.  Ingols  et  al.,  Modeling  Modern  Network  Attacks  and  Countermeasures  Using 
Attack  Graphs,  2009  Ann  Comp  Sec  App  Conf  (ACSACW).  Dec  7-11,  2009,  pp.117-126. 

4cf. M.H. Henry et al., Coupled Petri Nets for Computer Network Risk Analysis, Intl
Journal Critical Infrastructure Protection. 3(2), 2010, pp. 67-75.
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through  manipulation  of  CIS-supported  processes  via  access  and  control  au¬ 
thority  afforded  by  the  network  resources  accessible  in  the  corresponding 
attack state.5 Alternatively, consequence measures can be estimated using
decomposition techniques that provide a structure for assimilating and
aggregating assessments provided by subject matter experts (SME). This
approach decomposes a CIS in terms of data flows, network resources, and
mission activities to assert mission consequences due to network resource or
data manipulation.6

While the aforementioned modeling methods provide valuable insight
into the mechanics and potential outcomes of cyber attacks, they are essentially
limited to understanding attacks under passive defenses. As such, they
provide limited insight into the value of active defensive measures, whether
proactive or responsive, for the purpose of informing investments along these
lines. We assert that active defenses are critical, particularly in light of the
fact that legitimate system users inevitably provide a substantial component
of the attack surface by remaining susceptible to social engineering
(e.g., spear phishing) and other deception-based intrusion methods. Under
these conditions, passive defenses are less effective since activities executed
under the auspices of legitimate credentials generally appear to be benign
by passive measures.

While  researchers  and  practitioners  in  the  computer  network  defense 
community  generally  agree  with  this  assertion  on  an  intuitive  basis,  there  is 
little  agreement  on  how  best  to  invest  in  active  defenses.  Moreover,  there 
are  no  credible  mature  techniques,  other  than  SME  intuition,  to  assess  the 
value  of  different  investments  in  active  defensive  measures.  Finally,  there 
are  no  mature  tools,  other  than  clever  visualization  schemes,  that  provide 
deep  insight  into  how  attacks  are  playing  out  when  only  scarce  indicators 
are  available  to  inform  response  activities. 

We  are  developing  new  game  theoretic  methods  that  explicitly  account 
for  an  attacker’s  decision  process  in  the  context  of  active  defenses  and  partial 
information  so  that  system  architects  and  engineers  can  gain  insight  into  the 
value  of  these  defenses  and  their  associated  intrusion  detection  mechanisms 
for the purpose of informing broader security investment decisions. Moreover,
by explicitly modeling a partially observed state space, we are working
toward  methods  to  help  defenders  infer  the  true  extent  of  an  attack  that  is 

5cf. M.H. Henry et al., Evaluating the Risk of Cyber Attacks on SCADA Systems
via Petri Net Analysis. 2009 IEEE Conf Tech Homeland Sec HST’09. May 11-12, 2009,
pp. 607-614.

6cf. T. Llanso & E. Klatt, CyMRisk: An Approach for Computing Mission Risk due
to Cyber Attacks, 8th Ann IEEE Sys Conf (SysCon). Mar 31-Apr 3, 2014, pp. 1-7.
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underway  when  only  scarce  indicators  are  available  from  sensors. 

At a high level, our approach is to model the CIS Attacker’s decision
process as a partially observed stochastic optimization problem in the context
of opportunities (e.g., access to a host’s resources) and obstacles (e.g.,
credentials needed to access a host’s resources) that the intruder discovers
and responds to over the course of the intrusion. At the same time, the
CIS Defender may at times detect indicators of an intrusion in the form of
host infection and respond by introducing additional obstacles (e.g., isolating
and reconstituting an infected host). As such, the interaction constitutes
a partially observed stochastic game with finite state space. This work
complements other approaches reported in the academic literature.7
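
One ingredient of any partially observed model is a belief over the hidden attack state that is revised as indicators arrive. The sketch below shows a generic discrete Bayes filter step for that purpose. It is an illustration under stated assumptions, not the game-theoretic method described here: the three attack states, the transition probabilities, and the alert likelihoods are all hypothetical numbers chosen for the example.

```python
def update_belief(belief, transition, likelihood):
    """One predict-then-correct step of a discrete Bayes filter.

    belief:      {state: probability} before the step
    transition:  {state: {next_state: probability}} attacker progress model
    likelihood:  {state: probability of the observed indicator in that state}
    """
    states = list(belief)
    # Predict: push the belief through the (assumed) attack progression.
    predicted = {
        nxt: sum(belief[cur] * transition[cur][nxt] for cur in states)
        for nxt in states
    }
    # Correct: weight by how well each state explains the observation.
    weighted = {s: predicted[s] * likelihood[s] for s in states}
    total = sum(weighted.values())
    return {s: w / total for s, w in weighted.items()}


# Hypothetical three-state intrusion model.
prior = {"clean": 0.9, "foothold": 0.1, "admin": 0.0}
transition = {
    "clean": {"clean": 0.9, "foothold": 0.1, "admin": 0.0},
    "foothold": {"clean": 0.0, "foothold": 0.7, "admin": 0.3},
    "admin": {"clean": 0.0, "foothold": 0.0, "admin": 1.0},
}
alert_likelihood = {"clean": 0.05, "foothold": 0.40, "admin": 0.60}

posterior = update_belief(prior, transition, alert_likelihood)
print(posterior)
```

A single alert moves most of the probability mass away from the clean state in this toy model, which illustrates how sparse indicators can still shift a defender's estimate of how far an attack has progressed.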

Two research problems are the focus of our current work. The first is to
develop efficient computational methods for identifying near optimal strategies
when presented with partially observed state spaces. The necessary
estimation of belief measures on the true state quickly overwhelms standard
algorithmic approaches based on policy iteration. Techniques borrowed from
the artificial intelligence community are proving useful as means to approximate
optimal strategies under partial information. The second is to develop
reliable parameter estimation techniques from intrusion data, vulnerability
databases, and intrusion detection data. A forthcoming Springer volume,
expected later this year, will include a more in-depth discussion of our
model-based analysis work, its associated research challenges, and the anticipated
benefits for improving both strategic risk assessment and tactical situational
awareness.

In spite of the challenges, we are confident that approaches such as the
one we are pursuing will yield the insights needed to better understand, prepare
for, detect, and respond to conflicts in cyberspace. Moreover, we assert
that analytic techniques that account for adversary decision processes are
necessary to inform strategic and tactical decision-making when the adversary’s
intentions, maneuvers, and disposition are only partially, and perhaps
imperfectly, known. This has been the traditional approach to strategy
development in other conflict domains, and it applies equally to cyberspace.


7cf. S.A. Zonouz et al., RRE: A Game-Theoretic Intrusion Response and Recovery
Engine. 2009 Intl Conf Dep Sys & Nets (DSN09). Jun 29-Jul 2, 2009, pp. 439-448.
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Abstract  —  This  paper  describes  AMICA  (Analyzing  Mission 
Impacts  of  Cyber  Actions),  an  integrated  approach  for 
understanding  mission  impacts  of  cyber  attacks.  AMICA 
combines  process  modeling,  discrete-event  simulation,  graph- 
based  dependency  modeling,  and  dynamic  visualizations.  This  is 
a  novel  convergence  of  two  lines  of  research:  process 
modeling/simulation  and  attack  graphs.  AMICA  captures  process 
flows  for  mission  tasks  as  well  as  cyber  attacker  and  defender 
tactics,  techniques,  and  procedures  (TTPs).  Vulnerability 
dependency  graphs  map  network  attack  paths,  and  mission- 
dependency  graphs  define  the  hierarchy  of  high-to-low-level 
mission  requirements  mapped  to  cyber  assets.  Through 
simulation  of  the  resulting  integrated  model,  we  quantify  impacts 
in  terms  of  mission-based  measures,  for  various  mission  and  threat 
scenarios.  Dynamic  visualization  of  simulation  runs  provides 
deeper  understanding  of  cyber  warfare  dynamics,  for  situational 
awareness  in  the  context  of  simulated  conflicts.  We  demonstrate 
our  approach  through  a  prototype  tool  that  combines  operational 
and  systems  views  for  rapid  analysis. 


Keywords - modeling and simulation; mission assurance;

I. Introduction


In  the  U.S.  Department  of  Defense  (DoD)  roadmap  for  cyber 
modeling  &  simulation  (M&S),  planning  for  integrated  cyber 
and  kinetic  mission  assurance  is  a  key  capability  area  [1].  The 
range  of  capabilities  called  out  in  the  roadmap  underscores  the 
urgent  need  for  rapid  progress  in  this  area,  especially  given  the 
asymmetric nature of cyber conflict.

Figure 1. Spectrum of Cyber M&S Applications and Challenges


Of  particular  importance  is  the  integration  of  kinetic 
operations  with  the  defensive  cyber  operations  that  support 
them.  This  requires  effective  communication  of  cyber 
situations  (and  their  big-picture  impacts)  to  decision  makers.  In 
addition,  there  are  numerous  potential  applications  of  cyber 
M&S,  along  a  spectrum  of  increased  maturity  and 
corresponding  research  challenges,  as  shown  in  Figure  1. 

Understanding  mission  resilience  to  cyber  warfare  requires 
bringing  together  layers  of  information  from  numerous  sources. 
At  the  lower  layers,  network  topology,  firewall  policies, 
intrusion  detection  systems,  system  configurations, 
vulnerabilities,  etc.,  all  play  a  part.  We  can  combine  these  into 
a  higher-level  attack  graph  model  that  shows  transitive  paths  of 
vulnerability.  We  also  need  to  map  cyber  assets  to  mission 
requirements,  and  capture  dependencies  from  low-level 
requirements  to  higher-level  ones  appropriate  for  decision 
making.  Because  mission  requirements  are  highly  dynamic,  we 
need  to  capture  time-dependent  models  of  mission  flow.  Cyber 
attacks  and  defenses  are  similarly  dynamic,  and  defenses 
generally  vary  depending  on  particular  attack  classes. 
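
As an illustration of what "transitive paths of vulnerability" means, the sketch below walks a small attack graph breadth-first and records the shortest exploit chain to every reachable host. The graph, host names, and the function `shortest_attack_paths` are hypothetical; real tools derive such graphs from topology, firewall, and vulnerability data.

```python
from collections import deque


def shortest_attack_paths(attack_graph, entry_point):
    """Return the shortest chain of hosts from entry_point to each reachable host.

    attack_graph: dict host -> list of hosts the attacker can exploit from it.
    """
    paths = {entry_point: [entry_point]}
    queue = deque([entry_point])
    while queue:
        host = queue.popleft()
        for neighbor in attack_graph.get(host, []):
            if neighbor not in paths:          # first visit is the shortest chain
                paths[neighbor] = paths[host] + [neighbor]
                queue.append(neighbor)
    return paths


# Hypothetical network: an edge means "an exploit from A can reach B".
attack_graph = {
    "internet": ["mail_gw"],
    "mail_gw": ["workstation"],
    "workstation": ["file_server", "targeting_db"],
    "file_server": ["targeting_db"],
    "backup": [],
}
paths = shortest_attack_paths(attack_graph, "internet")
print(paths["targeting_db"])
```

Hosts that never appear in the result (here, `backup`) are not exposed to this entry point, which is the kind of fact a mission-dependency layer can then build on.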

We  introduce  an  approach  that  addresses  all  these  aspects  of 
mission-oriented  cyber  resilience,  through  an  integrated  M&S 
environment.  This  approach  is  called  Analyzing  Mission 
Impacts of Cyber Actions (AMICA). AMICA supports
exploration of, and experimentation with, the mission impacts of
cyber warfare. The goal is to develop a flexible, extensible,
modular,  multi-layer  M&S  system  for  quantitative  assessment 
of  operational  impacts  of  cyber  attacks  on  mission  performance. 
AMICA  is  expected  to  increase  our  understanding  of 
dependencies  between  operational  missions,  cyber  TTPs,  and 
computing  infrastructure. 

II.  Previous  Work 

There  have  been  numerous  information-centric  military 
exercises  with  aspects  of  mission  assurance  and  cyber  warfare. 
In  many  exercises  (e.g.,  Global  Thunder  [2]  and  Turbo 
Challenge  [3]),  cyber  security  is  an  important  component,  but 
not  the  primary  exercise  focus.  More  cyber-focused  exercises 
such  as  Cyber  Flag  [4]  have  integrated  cyber  activities  with 
operational missions for training purposes.

M&S has been applied in more traditional military spheres,
e.g.,  for  inferring  enemy  intent  [5],  entity-based  battlefield 
simulations  [6],  and  command  decision  support  [7].  However, 
military  mission  planning  has  yet  to  leverage  M&S  and  other 
formal  methods  as  part  of  its  standard  practice,  especially  in  the 
area  of  developing  cyber  defensive  courses  of  action.  In  short, 
tools  such  as  AMICA  for  assessing  mission  impact  of  cyber 
warfare  are  generally  unavailable  for  operations-level  support. 
The  defense  community  is  aggressively  accelerating  cyber 
defense  forces  [8],  further  motivating  the  need  for  more 
advanced  capabilities  in  cyber  course-of-action  planning. 

In  the  cyber  domain,  M&S  capabilities  are  still  relatively 
immature.  Still,  previous  work  can  be  leveraged  for  certain 
components  of  an  integrated  overall  M&S  approach.  Systems 
such as Topological Vulnerability Analysis (TVA) [9][10],
Network  Security  Planning  Architecture  (NetSPA)  [11],  and 
NRL’s  ACCEPT  (A  Configurable  Cyber  Event  Prioritization 
Tool)  [12]  fuse  network  data  (topology,  firewall  rules,  asset 
inventories,  vulnerability  scans/databases,  intrusion  alerts,  etc.) 
into  graph-based  models  for  mapping  vulnerability  paths  and 
prioritizing  events.  Capabilities  such  as  MITRE’s  Cyber 
Command  System  (CyCS)  [13]  and  Cyber  Mission  Impact 
Assessment  (CMIA)  [14],  and  AFRL’s  Cyber  Mission 
Assurance  [15]  capture  mission  and  cyber  dependencies. 

Another  key  enabler  for  cyber  M&S  is  standardization 
efforts.  Making  Security  Measurable™  [16]  is  a  collection  of 
standardization  activities  within  the  cyber  security  community. 
It  includes  Common  Vulnerabilities  and  Exposures  (CVE), 
Common  Attack  Pattern  Enumeration  and  Classification 
(CAPEC),  Cyber  Observable  Expression  (CybOX),  Structured 
Threat  Information  Expression  (STIX),  and  many  others.  These 
standards  cover  different  aspects  of  security  data  needed  for 
building  comprehensive  and  accurate  models. 

To capture the flow of mission and cyber processes, we
leverage the Object Management Group (OMG) Business
Process Model and Notation (BPMN) [17] standard. We employ the
commercial  tool  iGrafx  [18],  which  extends  BPMN  with 
behavioral  modeling,  critical-path  analysis,  discrete-event 
simulation,  Monte  Carlo  analysis,  and  experiment  design. 

III.  Approach 

To  explore  the  AMICA  approach,  we  are  conducting  a  pilot 
study  and  developing  a  proof-of-concept  system.  We  seek  a 
flexible,  extensible,  modular,  and  multi-layer  M&S  environment 
for  quantitative  assessment  of  operational  impacts  of  cyber 
attacks  on  specific  missions,  as  shown  in  Figure  2.  Thus 
components  can  be  interchanged,  e.g.,  multiple  missions  on  an 
infrastructure,  to  support  analysis  of  different  questions. 

AMICA  currently  includes  libraries  for  operational  (kinetic) 
missions,  computing  infrastructure  on  which  missions  depend, 
cyber attacker TTPs, and cyber defensive TTPs. Calibration and
validation of the model occur in concert with mission
commanders, operators, and cyber defenders. In essence, we are
connecting  cyber  effects  to  the  kinetic  domain,  in  the  context  of 
highly  dynamic  cyber  warfare  and  mission  threads.  This  helps 
commanders  better  maintain  mission  effectiveness  in  a  force-on- 
force  cyber-contested  environment,  and  align  defenses  for  best 
operational effect across a campaign.

Figure 2. Modular Libraries for Model Components


For  mission  analysts  and  commanders,  we  seek  to  answer 
questions  such  as  the  following: 

•  When  and  where  would  be  the  most  damaging  attacks 
against  the  mission? 

•  How  long  before  a  particular  attack  has  significant 
mission  impact? 

•  How  long  does  it  take  a  mission  to  recover  from  an 
attack? 

•  What  is  more  damaging  to  the  mission:  loss  of  reach- 
back  availability  or  degradation  of  system  assets? 

For  cyber  defenders  and  analysts,  we  consider  questions  such  as 
the  following: 

•  What  is  the  impact  of  better  sensor  performance,  sensor 
location,  etc.? 

•  How  does  a  change  to  the  network  topology  affect 
security  posture? 

•  How  well  does  the  defense  perform  against  different 
tiers  of  attacker? 

•  What  is  the  impact  of  different  defender  TTPs? 

•  How should the workforce be aligned to the cyber workload?

•  What  is  the  impact  of  adversary  attack  speed? 

•  What  is  the  impact  of  adversary  attack  timing? 

As  illustrated  in  Figure  3,  we  employ  a  layered  modeling 
structure.  This  allows  inputs  at  both  the  operational  and  cyber 
layers  to  influence  the  behavior  of  the  systems  layer,  to  produce 
a  combined  effect  on  mission  performance. 

Decoupling  via  layers  provides  model  independence,  with 
shared  interfaces.  This  enables  easy  migration  of  missions  and 
cyber  TTPs  as  situations  dynamically  evolve.  Figure  3  is 
notional  only,  and  does  not  include  all  the  model  layers  actually 
in  AMICA.  For  example,  there  are  layers  for  mission 
hierarchical  dependencies,  cyber  vulnerability  dependencies 
(attack graphs), etc.

Figure 3. Model Decoupling via Layered Structure


Figure  4  shows  the  architectural  structure  of  our  AMICA 
implementation. This illustrates AMICA’s novel approach for
blending  workflow  modeling  with  mission  dependencies  and 
attack  graphs.  Each  modality  (process-based  and  graph-based) 
captures  a  different  aspect  of  the  overall  picture:  workflow 
(process  modeling)  and  environment  (graph-based  relationships, 
constraints,  etc.).  This  allows  workflow  and  environment 
models  to  be  developed  independently,  aided  by  automatic 
generation  for  a  given  network. 


Figure 4. AMICA Architecture (stages shown: Modeling, Simulation, Analysis, Visualization)

Behavioral  and  temporal  aspects  of  the  system  (workflow, 
timing  constraints,  required  resources,  etc.)  are  implemented 
through  executable  process  models  and  stochastic  discrete-event 
simulation  (in  iGrafx).  Structural  and  functional  aspects 
(environmental  constraints,  mission  and  system  dependencies, 
event  flows,  etc.)  are  maintained  through  MIT  Lincoln 
Laboratory’s  Network  Environment  Oracle  (NEO),  and 
MITRE’s  Cyber  Command  System  (CyCS)  [13]  and 
CyGraph  [19].  CyCS  contains  a  directed  graph  comprising  the 
information  and  system  dependencies  of  each  mission  function. 
NEO  contains  additional  topological  and  vulnerability 
information  that  is  not  captured  in  CyCS.  CyGraph  provides 
topological  and  attack  graph-focused  visualization  of  the 
environment and cyber attack progress. The initial state of the
structural cyber (attack graph) model is generated from the
network topology, firewall rules, and system vulnerabilities via
the Government Off-The-Shelf (GOTS) tool TVA [9][10]. In
this way, we leverage established tools for dependency
knowledge management and automated model building.

To  capture  workflows,  decision  points,  workloads, 
resources,  and  temporal  constraints,  AMICA  employs  a 
technique  called  Mission-Level  Modeling  (MLM)  [20].  MLM 
leverages  BPMN  to  define,  refine,  and  verify  operational 
processes,  decisions,  and  information  flows  among 
producer/consumer  systems  and  people.  It  supports  model 
libraries  and  parameterization  to  quickly  assemble  new 
prototypes. MLM handles the high degree of concurrency
inherent in information-sharing operations, and explores impacts
on measures of effectiveness and performance (MOEs/MOPs) through
simulation of mission models.

MLM  is  based  on  BPMN  and  discrete-event  simulation, 
implemented  in  iGrafx  (a  commercial  tool).  MLM  replaces 
static  tools  such  as  Visio  and  PowerPoint,  providing  an 
executable,  visual  model  to  support  stakeholder  collaboration  to 
develop and validate new concepts. This provides a single
model for qualitative and quantitative analysis, and enables rapid
prototyping and reuse through a single modeling standard.

Figure 5 shows the operational flow among the AMICA subsystems.
The TVA tool [9][10] provides the network topology
and vulnerable attack paths through the network. This represents
the  initial  state  of  the  network,  before  cyber  attacks  and  defenses 
are  simulated.  TVA  initializes  NEO,  which  maintains  dynamic 
cyber  state  under  simulation  and  provides  choices  for  next 
possible  cyber  states.  Similarly,  CyCS  maintains  dynamic 
simulation  state  for  mission  dependencies. 


At  simulation  time,  iGrafx  simulates  mission  and  cyber 
threads  concurrently,  testing  cyber  and  mission  states  as  needed, 
and  updating  them  when  process  tasks  (i.e.,  cyber  attacker  and 
defender  tasks)  change  environmental  conditions.  For  example, 
when  the  cyber  attacker  process  compromises  a  mission-critical 
machine,  iGrafx  updates  the  node’s  state  in  CyCS  (which 
propagates  to  higher-level  mission  dependencies). 

Similarly,  if  the  cyber  defender  process  repairs  the  machine, 
its  state  is  reset  in  CyCS.  Asynchronously,  mission  tasks  check 
the  appropriate  higher-level  CyCS  nodes  upon  which  they 
depend.  Throughout  the  entire  process,  CyGraph  shows  the 
dynamic  state  evolution  through  animated  visualization. 
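
The interplay described above, in which attacker and defender tasks change shared state that mission tasks later check, can be sketched as a small event-driven loop. This is a plain-Python stand-in written for illustration, not the iGrafx, NEO, or CyCS interfaces; the machine names, task name, and event times are hypothetical.

```python
import heapq


def run_timeline(events, mission_needs):
    """Process timestamped cyber events in order and log mission status changes.

    events: iterable of (time, action, machine); action is "compromise" or "repair".
    mission_needs: dict mission task -> list of machines that task requires.
    Returns a list of (time, task, is_available) entries.
    """
    queue = list(events)
    heapq.heapify(queue)                       # earliest event is popped first
    compromised = set()                        # shared environment state
    status = {task: True for task in mission_needs}
    log = []
    while queue:
        time, action, machine = heapq.heappop(queue)
        if action == "compromise":
            compromised.add(machine)
        elif action == "repair":
            compromised.discard(machine)
        # Mission tasks check the machines they depend on after each change.
        for task, machines in mission_needs.items():
            available = not any(m in compromised for m in machines)
            if available != status[task]:
                status[task] = available
                log.append((time, task, available))
    return log


events = [
    (5.0, "compromise", "db01"),
    (2.0, "compromise", "ws07"),
    (9.5, "repair", "db01"),
]
mission_needs = {"target_nomination": ["db01", "app02"]}
log = run_timeline(events, mission_needs)
print(log)
```

In this toy run, the compromise of `ws07` has no mission effect because no task depends on it, while the compromise and later repair of `db01` toggle the availability of the dependent task.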
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IV.  Case  Study 

For  our  case  study,  we  consider  a  key  mission  within  a 
regional  Air  and  Space  Operations  Center  (AOC).  In  an  AOC, 
an  air  component  commander  provides  top-level  command  and 
control  of  air  and  space  operations.  In  our  case  study,  the 
mission  focus  is  deliberate  kinetic  targeting  [21],  from  basic 
target  development  through  development  and  publication  of  the 
Air  Tasking  Order  (ATO). 

Thus  we  model,  simulate,  and  quantitatively  analyze  the 
impact  of  cyber  attacks  on  the  targeting  mission  (number  of 
targets  successfully  processed)  within  an  AOC.  Our 
parameterized  library  of  AMICA  modules  can  be  rapidly 
reconfigured  to  represent  different  mission,  cyber  threat,  and/or 
cyber  defense  scenarios. 

Figure  6  shows  the  phases  of  progression  for  target 
development.  On-going  target  development  defines  all  possible 
targets  available  for  strike  in  the  area  of  responsibility  (AOR). 
When a crisis is anticipated, advanced target
development reexamines potential targets in preparation for
possible strike.

Figure 6. Target Development and ATO Process


Once  hostilities  actually  begin,  targets  are  nominated  for 
potential  inclusion  in  the  ATO.  Nominated  targets  are 
prioritized,  and  then  a  final  target  is  selected  based  on  available 
delivery  assets.  Targets  are  paired  with  assets,  leading  to  the 
completed  ATO. 

For  this  case  study,  we  leverage  Mission-Level  Modeling 
(MLM)  originally  developed  for  U.S.  European  Command 
(EUCOM)  for  Exercise  Austere  Challenge  2010  [22].  This 
covers  the  targeting  process  from  basic  target  development 
through  the  Master  Air  Attack  Plan  (MAAP)  and  ATO,  as  well 
as  Battle  Damage  Assessment  (BDA). 

This  targeting  model  has  over  200  steps,  with  timing  and 
required  resources  per  step.  The  model  is  organized  as  high- 
level  modules  that  reference  lower-level  reusable  library  models. 
Figure  7  shows  a  high-level  portion  of  this  model. 


Figure 7. Portion of ATO Target Development Model (modules shown: Target Guidance and ROE; TGT Folder Development; TGT Systems Analysis; Continuous Engagement Cycle; IC Target Vetting & Validation)


In  this  model,  each  target  is  tracked  through  the  target- 
development  process  until  completion,  including  whether  the 
confidentiality  or  integrity  of  the  target  data  was  breached. 
Through  simulation,  we  quantify  mission  performance  and 
effectiveness,  with  metrics  such  as  numbers  of  targets  making 
each  list,  timing  of  each  phase  of  development,  workforce 
utilization,  downtime,  etc. 

Figure  8  shows  a  high-level  portion  of  a  cyber  attacker 
model.  In  this  particular  scenario,  a  phishing  attack  results  in  a 
malware  infection,  giving  the  adversary  an  initial  presence 
inside  the  network.  The  attacker  then  moves  laterally  through 
the  network,  until  a  mission-critical  machine  is  compromised. 
At  that  point,  the  attacker  achieves  the  desired  attack  goal 
(compromising  confidentiality,  integrity,  and/or  availability). 
Depending  on  the  scenario  settings,  the  adversary  may  delay  the 
final  impact  to  coincide  with  a  critical  phase  of  the  mission. 


Figure  8.  Portion  of  Cyber  Attacker  Model 

Figure  9  shows  a  high-level  portion  of  the  cyber  defender 
model.  The  process  is  triggered  by  an  alert  (intrusion  detection 
system,  user  tipoff,  etc.),  followed  by  triage  to  understand  the 
basic nature of the alert.

Depending on the severity of the incident and past history
with  the  victim  machine,  the  defender  either  reboots  the 
machine,  restores  corrupted  data,  or  rebuilds  the  machine  from  a 
non-compromised  image.  If  an  infection  is  detected  or  a 
machine  is  a  victim  in  multiple  incidents,  the  defender  conducts 
more  in-depth  forensics.  This  involves  searching  for  other 
infections  and  rebuilding  victims  as  needed. 
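
The defender's branching logic can be sketched as a small decision function. The text states only that the choice depends on severity and past history; the three-level severity scale and the specific thresholds below are assumptions made for illustration, as is the function name `choose_response`.

```python
def choose_response(severity, prior_incidents, infection_detected):
    """Pick a remediation action and decide whether to run in-depth forensics.

    severity: "low", "medium", or "high" (assumed scale).
    prior_incidents: number of earlier incidents involving this machine.
    infection_detected: True if triage found an infection.
    Returns (action, run_forensics).
    """
    if severity not in ("low", "medium", "high"):
        raise ValueError(f"unknown severity: {severity!r}")
    # Assumed thresholds: escalate to a rebuild for severe or repeat cases.
    if severity == "high" or prior_incidents >= 2:
        action = "rebuild_from_clean_image"
    elif severity == "medium":
        action = "restore_corrupted_data"
    else:
        action = "reboot"
    # Forensics follows a detected infection or a repeat victim.
    run_forensics = infection_detected or prior_incidents >= 1
    return action, run_forensics


print(choose_response("medium", prior_incidents=1, infection_detected=False))
```

In a process model, each returned action would correspond to a task with its own duration and staffing requirement, which is how defender choices feed back into mission timing.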

As with the mission model, the cyber (attacker and defender)
models are modular, with higher-level models referencing sub-models.
That is, process tasks (boxes) in a given model may
represent  entire  sub-models  defined  elsewhere  in  the  AMICA 
library. 

Our  cyber  model  leverages  previous  collaborative  work  with 
cyber  defenders  to  define  a  process  flow  for  their  operations. 
This  model  captures  adversary  TTPs  for  major  classes  of  attacks 
(email-based,  browser-based,  and  host-based),  with 
corresponding  defensive  TTPs.  This  collaborative  work  has 
produced  a  rich  process  diagram  (in  Visio),  approaching  1000 
steps.  For  AMICA,  we  use  this  as  the  basis  for  an  executable 
model  in  iGrafx. 

As  described  in  Section  III,  the  cyber  attacker  and  defender 
processes  (in  iGrafx)  interact  through  the  Network  Environment 
Oracle  (NEO).  NEO  maintains  state  in  the  cyber  attack  graph, 
which  the  attacker  and  defender  process  models  check  for 
environmental  conditions  required  for  taking  next  steps 
(vulnerabilities,  reachability,  infection  state,  etc.). 

NEO state is reflected in CyGraph [19], a MITRE tool for
cyber  graph  analytics,  interactive  visualization,  and  animation. 
Figure  10  shows  a  representative  attack  graph  in  CyGraph,  with 
infected  machines  in  red  and  rebuilt  machines  in  green. 


Figure  10.  Cyber  Attack  Graph  with  Dynamic  States 

While  NEO  maintains  state  for  cyber-related  assets, 
MITRE’s  Cyber  Command  System  (CyCS)  maintains  state  for 
mission-related  assets.  CyCS  models  mission  dependencies  as 
a  directed  acyclic  graph  (hierarchy).  The  upper  levels  of  the 
hierarchy  are  high-level  mission  assets  (organizations,  major 
work products, etc.). These are mapped to subordinate entities
on which they depend. Dependencies can be conjunctive
(Boolean AND) or disjunctive (Boolean OR). At the bottom of
the hierarchy are those entities with no subordinates. Figure 11
shows  a  representative  mission-dependency  graph,  visualized 
via  CyGraph. 


Figure  11.  Graph  of  Mission  Dependencies 
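
The AND/OR dependency semantics described above can be sketched as a recursive evaluation over the hierarchy. This is an illustrative stand-in, not the CyCS data model; the node names and the function `is_operational` are hypothetical.

```python
def is_operational(node, graph, failed):
    """Evaluate whether a mission node is still supported.

    graph: dict node -> (operator, children), operator is "AND" or "OR".
           Nodes absent from graph are leaves (entities with no subordinates).
    failed: set of leaf entities currently unavailable.
    """
    if node not in graph:
        return node not in failed
    operator, children = graph[node]
    if operator not in ("AND", "OR"):
        raise ValueError(f"unknown operator: {operator!r}")
    results = [is_operational(child, graph, failed) for child in children]
    # AND needs every dependency; OR needs at least one.
    return all(results) if operator == "AND" else any(results)


# Hypothetical hierarchy: publishing needs the database and some comms link.
graph = {
    "publish_ato": ("AND", ["targeting_db", "comms"]),
    "comms": ("OR", ["primary_link", "backup_link"]),
}
print(is_operational("publish_ato", graph, failed={"primary_link"}))
```

A disjunctive node models redundancy: losing one link leaves the mission function intact, whereas losing any conjunctive dependency propagates upward.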


As  an  example  of  the  quantitative  analyses  available  through 
AMICA,  consider  Table  1.  This  shows  mission  impact  from  a 
simulated  cyber  attack.  In  this  scenario,  the  attack  results  in  loss 
of  availability  of  a  mission-critical  database  service. 


Table 1. Impact of Availability Attack (JTL Targets)

Cycle      Without Attacks   With Attacks   Relative Impact
4 days            9                1              88%
7 days           21                1              95%
14 days          76               70               8%

In  this  scenario,  the  attack  occurs  during  routine  operations 
early  in  the  target-development  process.  The  metric  for  cyber 
impact  is  a  mission-based  measure  of  performance  (MOP):  the 
number  of  targets  that  make  the  Joint  Target  List  (JTL).  The 
relative  impact  in  the  table  (in  percent)  is  then 

$$\text{Impact}_{\text{relative}} = 100 \cdot \left[\, 1 - \left( n_{\text{with attacks}} / n_{\text{without attacks}} \right) \right].$$

The  experiment  is  to  determine  a  baseline  number  of  JTL 
targets  produced  in  the  absence  of  an  attack,  and  to  compare  that 
to  the  number  of  JTL  targets  produced  when  the  AOC  is  under 
attack. 
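
As a worked check of this measure, the sketch below applies the relative-impact formula to the target counts reported for the availability scenario. The function name `relative_impact` is chosen here for illustration; the input counts are taken from Table 1.

```python
def relative_impact(with_attacks, without_attacks):
    """Percent reduction in targets produced, relative to the no-attack baseline."""
    if without_attacks <= 0:
        raise ValueError("baseline count must be positive")
    return 100.0 * (1.0 - with_attacks / without_attacks)


# (cycle, targets without attacks, targets with attacks) from Table 1.
table_1 = [("4 days", 9, 1), ("7 days", 21, 1), ("14 days", 76, 70)]
for cycle, baseline, attacked in table_1:
    print(cycle, f"{relative_impact(attacked, baseline):.1f}%")
```

The 4-day entry evaluates to roughly 88.9 percent, which Table 1 lists as 88%; the other two entries match the table when rounded to the nearest percent.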

The results in Table 1 show a dramatic mission impact from
the cyber attack. Moreover, the effects are fairly long-lasting;
after a week, only one JTL target has been produced (versus the
expected 21 targets). By the end of the second week after the
attack, JTL target production has mostly caught up.

In  these  experiments,  the  processing  of  each  target  is 
simulated  individually.  At  various  points  in  the  process,  there 
are  certain  conditions,  timings,  etc.,  that  have  some  degree  of 
uncertainty.  These  are  modeled  as  probability  distributions  in 
the  appropriate  points  in  the  model.  In  a  simulation  run,  Monte 
Carlo  analysis  executes  the  stochastic  model  according  to  model 
parameters. 
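
A minimal sketch of this kind of Monte Carlo experiment follows. It is not the iGrafx model: the triangular per-target processing time, the 96-hour cycle, the 72-hour outage, and the target count are all hypothetical parameters chosen to show how repeated stochastic runs yield a distribution rather than a single number.

```python
import random
import statistics


def simulate_cycle(n_targets, outage_hours, cycle_hours, rng):
    """Count targets completed in one cycle for a single stochastic run."""
    clock = outage_hours                 # processing stalls until service returns
    completed = 0
    for _ in range(n_targets):
        clock += rng.triangular(2.0, 8.0, 4.0)   # assumed hours per target
        if clock > cycle_hours:
            break
        completed += 1
    return completed


def monte_carlo(runs, n_targets, outage_hours, cycle_hours, seed=0):
    """Repeat the run and summarize the resulting distribution."""
    rng = random.Random(seed)            # seeded for repeatable experiments
    counts = [
        simulate_cycle(n_targets, outage_hours, cycle_hours, rng)
        for _ in range(runs)
    ]
    return statistics.mean(counts), min(counts), max(counts)


baseline = monte_carlo(500, n_targets=30, outage_hours=0.0, cycle_hours=96.0)
attacked = monte_carlo(500, n_targets=30, outage_hours=72.0, cycle_hours=96.0)
print("baseline (mean, min, max):", baseline)
print("attacked (mean, min, max):", attacked)
```

Reporting the minimum and maximum alongside the mean gives a rough bound on uncertainty; a fuller analysis would report percentiles or confidence intervals.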
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Table  2  shows  quantitative  results  from  another  AMICA 
simulation. This scenario is an integrity attack against a critical
database during advanced target development (the phase that
prepares for an anticipated crisis). The mission-based MOP for
measuring  cyber  impact  is  the  number  of  targets  added  to  the 
Joint  Integrated  Prioritized  Target  List  (JIPTL). 


Table 2. Impact of Integrity Attack (JIPTL Targets)

Cycle      Without Attacks   With Attacks   Relative Impact
4 days          574              303              47%
7 days         1098             1044               5%
14 days        1098             1087               1%

The  results  in  Table  2  show  that  this  attack  is  less  impactful 
in  terms  of  relative  reduction  in  targets  processed.  Moreover, 
the  AOC  is  able  to  rebound  from  the  attack  more  quickly. 

Figure  12  shows  the  relative  impact  on  mission  performance 
for  the  two  attack  scenarios:  (1)  availability  attack  against 
producing  the  JTL  in  routine  early  development,  and  (2) 
integrity  attack  against  producing  JIPTL  in  advanced  target 
development  in  preparation  for  crisis. 


Figure 12. Relative Impact for Two Attack Scenarios (x-axis: 4 days, 7 days, 14 days; series: Scenario 1, availability impact, reduction of targets on JTL in routine development; Scenario 2, integrity impact, reduction of targets on JIPTL during crisis preparation)

Of  course,  not  all  target-production  numbers  may  be  equally 
important.  For  example,  the  criticality  of  the  development  phase 
itself  may  be  a  strong  factor  in  overall  impact.  But  it  is  clear  that 
AMICA  provides  a  quantitative  approach  to  address  these  kinds 
of  questions,  based  on  simulation  of  vetted  models  for  missions 
and  cyber  TTPs. 

We  are  investigating  a  range  of  more  advanced  attacks 
against  different  portions  of  the  targeting  process,  such  as  data 
alterations  that  interfere  with  battle  damage  assessment,  move 
target  locations,  inject  discrepancies  that  force  massive  rework, 
etc. 


V.  Summary  and  Next  Steps 

We  have  described  an  integrated  approach  for  quantitative 
analysis  of  mission  impact  from  cyber  attacks,  known  as 
AMICA  (Analyzing  Mission  Impacts  of  Cyber  Actions). 
AMICA  defines  process  models  for  mission  threads  and  cyber 
tactics,  techniques,  and  procedures  (TTPs).  These  process 
models  are  designed  as  a  hierarchically-decomposed  library  of 
reusable  modules,  for  rapid  reconfiguration  and  prototyping. 

AMICA  process  models  are  probabilistic  and  executable, 
supported  by  discrete-event  simulation  and  stochastic  Monte 
Carlo  analysis.  Through  simulation  of  mission  and  cyber 
models,  we  are  able  to  quantitatively  assess  mission  impact  from 
cyber  attacks.  Monte  Carlo  analysis  provides  distributions  over 
multiple  simulation  runs,  for  bounding  uncertainty  in  results. 
For process modeling and simulation we apply the industry-standard
Business Process Model and Notation (BPMN), implemented in a
commercial tool (iGrafx).

While  process  models  capture  workflow  and  behavioral 
phenomena,  processes  necessarily  operate  within  the  structural 
constraints  and  dependencies  of  a  particular  environment.  This 
includes  dependencies  between  mission  requirements  and  cyber 
assets,  as  well  as  constraints  on  attacker  freedom  of  movement. 
We  capture  these  through  graph  models  (mission-dependency 
graphs  and  attack  graphs),  which  are  dynamically  updated  under 
process-model  simulation. 

This  novel  merging  of  M&S  modalities  supports  dynamic 
simulation  while  leveraging  established  tools  for  cyber/mission 
knowledge  management  and  automatic  model  building  (e.g., 
attack graphs). Through simulation of this integrated multimodal
model, AMICA quantifies cyber impacts in terms of
mission-based  measures,  for  desired  mission  and  threat 
scenarios.  We  provide  animated  visualizations  of  simulation 
runs,  showing  environmental  state  changes  during  the  interplay 
of  cyber  force-on-force  warfare. 

We  demonstrate  AMICA  through  a  case  study,  showing 
cyber  impacts  against  a  particular  kinetic  mission:  targeting  for 
Air  Tasking  Order  (ATO)  development  in  an  Air  and  Space 
Operations  Center  (AOC).  We  model,  simulate,  and  quantify 
the impact of cyber attacks on the targeting mission. We show
impact results for two attack scenarios (availability and
integrity) against different phases of the target-development
process. Our simulations quantify cyber impact in
terms  of  mission-relevant  measures  (numbers  of  targets 
completed)  over  time. 

In  the  future,  we  plan  to  develop  a  more  rigorous 
experimental  framework  for  posing  hypotheses,  designing 
experiments,  and  validating  results.  The  goal  is  to  provide  a  rich 
and  agile  environment  for  gaining  scientific  insights.  Examples 
of  such  hypotheses  include: 

•  Levels  of  Fidelity:  Given  a  threat  model,  what  is  the 
right  level  of  fidelity  to  predict  mission  impact  with 
sufficient  accuracy? 

•  Threat Classes: For a given set of threat classes, what
level of coverage is sufficient to maintain mission
readiness?

•  Attacker TTPs: What is the right degree of automation
to achieve a desired mission impact? How much
knowledge is required for a desired impact?

•  Attack Dynamics: When should the adversary attack to
have the highest mission impact? Which attack mode
(e.g., fast smash-and-grab or slow-and-stealthy) can
cause greater mission impact? How many concurrent
attacks can the mission withstand?

•  Defense TTPs: Under what conditions are static
defenses inadequate? What is the best combination of
static, dynamic, and synergistic defenses?

•  Attack Surface and Resiliency: What degree of
diversity gives adequate protection against zero-day
attacks? What is the right balance between diversity,
redundancy, containment, and cost?

Overall,  AMICA  merges  cyber  and  kinetic  domains  (mission 
threads,  cyber  TTPs,  network  environment,  etc.)  into  a  common 
M&S  environment,  with  complementary  modeling  modalities 
(process-based  and  graph-based).  This  provides  a  strong 
foundation for answering these kinds of questions.


Modeling Risk and Agility Interaction on Tactical Edge

James  R.  Morris-King  and  Hasan  Cam 


Abstract —  Cyber-risk  models  often  explore  exploitation 
methods,  agility  maneuvers,  and  mitigation  techniques  to  reduce 
vulnerabilities/counter  risks.  Cyber-agility  models  employ 
quarantine  and  inoculation-like  maneuver  procedures  to  protect 
vulnerable  systems  from  a  known,  detected  threat.  Although 
fairly effective, these procedures often diminish network function
in tactical environments, which adversely impacts mission
assurance and increases system damage beyond the exploit itself.
This  paper  proposes  a  novel  risk-classifier  model  which  assesses 
influence  to  ensure  tactical  edge  networks  function  during  an 
attack  by  preserving  critical,  high-risk  nodes.  Unlike  other  risk 
assessment  strategy  models,  our  model  employs  temporal 
propagation  graphs  to  capture  the  impact  of  vulnerability 
exploits.  These  high-risk  nodes  are  supported  by  an  agility 
process  that  reacts  to  an  attack  by  quarantining  exploited 
systems  and  designating  viable  successors  to  carry  on  key  mission 
functions  with  varying  degrees  of  service  availability.  We  validate 
this  model  via  an  agent-based  simulation.  Our  simulation  results 
indicate  that  risk  analysis-supported  agility  maneuvers 
outperform  reflexive  strategies. 


Index  Terms — tactical  edge  network,  risk,  agility,  agent-based 
simulation,  ecological  modeling,  epidemic  system,  risk 
propagation 


I.  Introduction 

Within the tactical edge paradigm, battlespace agility is a
warfighting concept defined as the speed at which the
warfighting organization is able to transform knowledge into
actions for desired effects in a battlespace (Libicki & Johnson,
1996; Mitchell, 2012a). The deterministic nature of the cyber-physical
battlespace creates asymmetry in cyber warfare at the
tactical  edge.  This  determinism  allows  adversaries  to  plan, 
coordinate  and  launch  attacks  effectively,  while  defenders  lack 
the  capabilities  to  predict  attack  strategies  or  react  in  a  timely 
fashion  (Mitchell,  2012b).  The  growing  need  to  respond 
quickly  to  cyber  threats  in  the  modern  battlespace  presents 
many  challenges  to  operators  concerned  with  preserving 
integrity  and  mission-assurance  of  cyber  Command  &  Control 
(C2)  systems.  One  challenge  that  this  paper  addresses  is  how 
to  minimize  the  adverse  impact  of  vulnerability  exploitations 
in  critical  nodes  by  employing  network  influence  assessment 
to choose the best practical agility maneuver.

Network  influence  is  a  concept  drawn  from  social 
networking  theory  and  is  defined  in  Kempe  et  al.  (2003)  as  the 
extent  to  which  individuals  are  likely  to  be  affected  by 
decisions  of  their  neighbors.  In  the  tactical  edge,  influence  is 
considered  as  a  measure  of  one  cyber-physical  system’s  (CPS) 
ability  to  control  the  behavior  and  state  of  other  systems.  In 
this  context,  cyber  agility  is  a  reasoned  modification  to  a  CPS 
in  response  to  a  functional,  performance,  or  security  need 
(McDaniel  et  al.,  2014).  Perhaps  the  greatest  need  for  agility 
can be found in formulating agility maneuvers, or strategies to
mitigate  damage  to  cyber  networks  when  CPS  are 
compromised  by  an  attacker.  There  exists  a  suite  of  known 
metrics  for  evaluating  the  scope  and  effective  impacts  of  cyber 
agility  maneuvers  (Pfister,  2012).  These  metrics  include 
robustness,  resilience,  responsiveness,  and  adaptation 
measures  which  are  used  for  impact  assessment  rather  than 
future  decision-making.  Traditional  models  of  cyber  agility 
impact  assessment  do  not  consider  properties  such  as  mobility, 
temporality,  and  environmental  interference  when  evaluating 
threats  to  CPS  (Riley  &  Ammar,  2002).  This  makes  the 
development  of  predictive  analytical  models  difficult,  and 
represents  an  open  challenge  in  cyber  risk  assessment  (CRA). 

Tactical  edge  networks  share  many  properties  in  common  with 
traditional  mobile  ad-hoc  networks  (MANET),  which  are 
defined  by  Burbank  et  al.  (2006)  as  “deployed  networks 
supporting  users  and  platforms  within  the  tactical  operations 
region”.  They  are  often  sparse  and  dynamical,  consisting  of  a 
heterogeneous  mixture  of  various  autonomous  and  human- 
operated  networked  systems.  When  attackers  penetrate  these 
networks,  it  is  the  role  of  cyber  operators  and  network 
specialists  to  devise  and  execute  countermeasures,  or  agility 
maneuvers,  to  mitigate  these  attacks  and  recover  damaged  or 
compromised  systems.  An  agility  maneuver  may  operate  in 
any  of  the  domains  available  to  socio-technical  security 
models  (physical,  virtual,  cognitive,  and  policy).  Regardless  of 
the  selected  domain,  the  purpose  of  the  agility  maneuver 
remains the same: altering the vulnerability landscape such that the risk
of present and future attacks is reduced or eliminated.

In  order  to  properly  relate  risk  and  agility,  it  is  necessary  to 
develop  cyber-social  models  that  are  able  to  classify  risk 
controllers  in  the  cyber  environment  and  demonstrate  the 
effect  of  agility  maneuvers  on  mission-critical  vulnerabilities. 
To  that  end,  we  propose  a  risk-classifier  model  that 
incorporates mission-assurance evaluation, criticality, and
propagating  risk  analysis.  Our  contributions  are  multifold.  We 
propose  an  ecology-inspired  influence  metric  and  a  cost 
function  to  approximate  the  impact  to  the  network  of  a 
propagating  cyber-attack.  This  metric  is  then  used  to  support 
an  agility  maneuver  which  selects  successor  host  nodes  with 
minimal  risk.  We  include  the  physical  and  temporal  natures  of 
cyber  risk  on  the  tactical  edge  by  incorporating  system 
mobility  as  well  as  incorporating  power  and  bandwidth 
constraints  in  the  model.  The  response  of  the  network  to 
a propagating exploit is measured in terms of cascading damage
to  mission-critical  nodes  and  the  cost  of  mitigation  and 
recovery  operations.  We  then  validate  the  model  through  the 
development  of  a  multi-agent  simulation. 

The remainder of this paper is organized as follows. Relevant
work is discussed in Section II. Section III states the problem and
defines our risk measures. Section IV presents the influence
model for the tactical edge. Section V describes how risk
evaluation can be used to drive agility maneuver selection.
Section VI describes the simulation model and results.
Concluding remarks are made in Section VII.

II.  Relevant  Work 

Mattson (2007) highlighted the need for new cyber models
that include the impact of mobile device use on
network security. Libicki (2007) proposed that three
distinct layers must be represented in models of cyber systems:
the physical layer, the syntactic layer, and the semantic layer.
These  layers  possess  distinct  attributes  which  allow  models  to 
display  meaningful  interactions.  Shapiro  &  Varian  (1999)  note 
the  complementary  network  effects  of  adding  or  removing 
highly  critical  network  nodes  in  distributed  systems.  Fortson 
(2007) highlights a number of deficiencies of common CRA
practices and identifies several objectives for impact analysis,
which include documentation of dependency relationships;
the ability to show the effects of timing and duration of attacks on
cyber targets; and prediction of mission impact. A wide
variety  of  reward-based  system  dependability  and 
performance  measures  are  discussed  in  Sanders  &  Meyers 
(1991)  and  Trivedi  (2001).  Various  proactive  mitigation 
maneuvers  were  explored  by  Haadi  et  al  (2014)  who  proposed 
a  novel  moving-target-defense  strategy  which  was  evaluated 
via  deterrence,  deception,  and  detectability  metrics.  Whiteman 
(2008)  and  others  have  proposed  tools  for  performing  CRA 
which  leverage  simulation  and  automated  mission-plan 
validation. However, these models have little use in predicting
multi-stage  propagating  exploits  (Yu,  2013). 

III.  Problem  Statement 

We  consider  a  mobile  network  as  a  tactical  edge  that  has  n 
mobile  nodes,  each  corresponding  to  a  vehicle  and/or  user 
with  devices.  Each  device  can  have  one  or  more  assets  (e.g., 
software,  hardware,  data,  service).  This  network  is  represented 
as a directed graph, denoted by $G = (V, E)$, where $V$ is the set
of $n$ vertices and $E$ is the set of all directed edges
representing the connections between mobile nodes. A
directed edge $(i, j)$ from node $i$ to node $j$ exists iff node $i$ can
transmit to node $j$ directly. When an asset of node $k$ is
infected, exploited, or suffers a fault, this failure has the
potential to influence all of $k$’s neighbors, based on their
individual susceptibility. The influence exerted on node $j$ from
an exploit or fault at node $i$ is expressed via an influence
metric $x_{ij}$, where $x_{ij} \in [0, 1]$. The directed graph of
influence is denoted as $IG = [x_{ij}]$.

The local composite of directed influence is described by
the variables $\rho_k$ and $\tau_k$, where $\rho_k$ represents the sum of $x_{kj}$ on
all outgoing edges from node $k$ in $IG$, and $\tau_k$ represents the
sum of $x_{jk}$ on all incoming edges terminating at node $k$ in $IG$. If
$\rho_k > \tau_k$, then node $k$ is said to be a controller node in the
network graph; else if $\rho_k < \tau_k$, then node $k$ is said to be a
dependent node in the network graph.
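
The controller/dependent labelling can be illustrated with a minimal sketch. It assumes the influence graph is stored as a dictionary of edge weights; the function name, the node names, and the "neutral" label for ties (a case the text does not define) are illustrative assumptions.

```python
from typing import Dict, Tuple

# Hypothetical influence graph IG: (i, j) -> x_ij in [0, 1].
Influence = Dict[Tuple[str, str], float]


def classify_nodes(ig: Influence) -> Dict[str, str]:
    """Label each node by comparing outgoing and incoming influence sums."""
    outgoing: Dict[str, float] = {}  # sum of x on edges leaving the node
    incoming: Dict[str, float] = {}  # sum of x on edges entering the node
    for (i, j), x in ig.items():
        outgoing[i] = outgoing.get(i, 0.0) + x
        incoming[j] = incoming.get(j, 0.0) + x

    labels: Dict[str, str] = {}
    for k in set(outgoing) | set(incoming):
        out_k, in_k = outgoing.get(k, 0.0), incoming.get(k, 0.0)
        if out_k > in_k:
            labels[k] = "controller"
        elif out_k < in_k:
            labels[k] = "dependent"
        else:
            labels[k] = "neutral"  # assumption: ties are not defined in the text
    return labels


ig = {("a", "b"): 0.8, ("a", "c"): 0.5, ("b", "a"): 0.2, ("c", "b"): 0.1}
labels = classify_nodes(ig)
```

In this toy graph node a exerts more influence than it receives, so it is the only controller.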

Criticality  is  assessed  as  a  function  of  the  number  of  shared 
assets  on  node  k,  the  relative  importance  of  those  assets  to  the 
operation,  the  nodes  which  access  these  assets,  and  the 
communication  paths  which  utilize  node  k  as  a  hub  or  sink.  It 
can  be  represented  by  the  following  equation: 
$\gamma_k = (A_k * S) + DN_k + ICP_k$ (1)

where $A_k$ represents the available assets at node $k$, $S$ is a
scalar modifier representing the importance of those assets,
$DN_k$ represents the number of nodes which treat node $k$ as
host, and $ICP_k$ represents the communication paths which pass
through or terminate at node $k$. From this we say that there
exists a set of critical nodes $B$ such that $B \subseteq G$. A node $k \in B$
iff $\gamma_k > \eta$, where $\eta$ is an arbitrary criticality threshold
determined at the start of an operation. The set of nodes in $B$
may  change  over  time.  This  variability  is  subject  to  network 
topology  changes  resulting  from  node  mobility,  node  loss  via 
malicious  exploit  or  power  depletion,  and/or  cyber  agility 
maneuver  by  network  managers. 
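
As a rough illustration of equation (1) and the critical set, the sketch below computes a criticality score per node and keeps the nodes above a threshold. The class, field names, node names, and numeric values are hypothetical; asset and path counts are assumed to be simple integers.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass
class NodeProfile:
    assets: int      # available assets at the node
    dependents: int  # nodes that treat this node as host
    comm_paths: int  # communication paths through or ending at the node


def criticality(p: NodeProfile, importance: float) -> float:
    """Equation (1): (assets * importance) + dependents + comm_paths."""
    return p.assets * importance + p.dependents + p.comm_paths


def critical_set(profiles: Dict[str, NodeProfile],
                 importance: float, threshold: float) -> Set[str]:
    """Nodes whose criticality exceeds the chosen threshold."""
    return {k for k, p in profiles.items()
            if criticality(p, importance) > threshold}


profiles = {
    "server": NodeProfile(assets=3, dependents=4, comm_paths=5),
    "radio": NodeProfile(assets=1, dependents=0, comm_paths=2),
    "uav": NodeProfile(assets=2, dependents=1, comm_paths=1),
}
B = critical_set(profiles, importance=2.0, threshold=5.0)
```

Because the set is recomputed from the current profiles, re-running it after a topology change reflects the time-varying membership described above.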

Criticality also functions as a component of the impact
measure $I_k$, which indicates the damage suffered by the network if this
node is lost. This measure is expressed as:

$I_k = \frac{\gamma_k}{\gamma_G}$ (2)

where $I_k$ represents the damage to the system from the loss of
node $k$ and $\gamma_G$ is the criticality of the network. $\gamma_G$ is expressed
as:

$\gamma_G = \sum_{i \le n} \gamma_i$ (3)

where $\gamma_k$ is the criticality of node $k$, $n$ is the number of active
nodes in the network, and $\gamma_i$ is the criticality of the $i$th node.
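
A small worked sketch of the impact measure follows, assuming each node's criticality score is already known. The node names and values are hypothetical.

```python
from typing import Dict


def impact_scores(criticality: Dict[str, float]) -> Dict[str, float]:
    """Impact of losing each node: its criticality over the network total."""
    total = sum(criticality.values())  # network criticality over active nodes
    if total == 0:
        raise ValueError("network criticality is zero; impact is undefined")
    return {k: value / total for k, value in criticality.items()}


node_criticality = {"server": 15.0, "radio": 4.0, "uav": 6.0}
impact = impact_scores(node_criticality)
```

With these values the network criticality is 25, so losing the server accounts for 15/25 of the total and the impact scores sum to one.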

Performing  agility  maneuvers  on  network  nodes  engenders 
a  cost  measured  by  C,  which  is  expressed  by  the  following 
equation: 

C  =  D*PR*ET  (4) 

where  D  represents  the  relative  sophistication  or  expertise 
required  to  perform  the  maneuver,  PR  represents  the  financial 
investment  (in  terms  of  parts  and  labor)  of  performing  the 
maneuver,  and  ET  represents  an  estimate  of  time  required  to 
complete  the  maneuver. 
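
Equation (4) can be sketched as follows, together with a simple budget filter. The maneuver names, the factor values, and the idea of filtering by a numeric budget are illustrative assumptions, not values from the paper.

```python
from typing import Dict, List, Tuple


def maneuver_cost(difficulty: float, price: float, est_time: float) -> float:
    """Equation (4): C = D * PR * ET."""
    return difficulty * price * est_time


def affordable(maneuvers: Dict[str, Tuple[float, float, float]],
               budget: float) -> List[str]:
    """Names of maneuvers whose cost does not exceed the budget."""
    return sorted(name for name, (d, pr, et) in maneuvers.items()
                  if maneuver_cost(d, pr, et) <= budget)


# Hypothetical (D, PR, ET) triples for three maneuvers.
maneuvers = {
    "quarantine": (1.0, 2.0, 1.0),
    "self_healing": (2.0, 2.0, 2.0),
    "patching": (3.0, 4.0, 2.0),
}
options = affordable(maneuvers, budget=10.0)
```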

Using these measures we are able to address the following
problems: 
Problem 1: Assess influence and propagation of vulnerability
exploitation  in  tactical  edge  by  incorporating  its  main  features 
(mobility,  power,  bandwidth,  delay,  etc.)  and  identify  high- 
risk  nodes  based  on  their  criticality. 

Problem  2:  Characterize  a  set  of  basic  cyber  agility 
maneuvers  based  on  common  network  security  practices  with 
respect  to  I  and  C.  Show  via  simulation  how  these  maneuvers 
can  be  deployed  under  cost  and  resource-constrained  scenarios 
using  influence  as  a  classifier  heuristic  to  minimize  total 
network impact ($I$).

IV.  Influence  Model  for  Tactical  Edge 

The  influence  model  for  the  tactical  edge  environment  is 
formed  from  the  union  of  two  sets  of  operational 
requirements: mission assurance and criticality.

A.  Mission  Assurance 

To  represent  mission  assurance  across  nodes,  we  modify  the 
Mission-Service-User  (MSU)  model  proposed  by  D’Amico  et 
al.  (2010)  and  Cam  and  Mouallem  (2012).  In  this  view,  cyber 
operations  have  four  critical  components: 

1)  Basic  assets,  such  as  data  stores,  routers,  networked 
sensors,  etc. 

2)  Services  rely  on  assets  to  deliver  information  or 
capabilities  across  the  network 

3)  Tasks  rely  on  time-sensitive  service  availability  to 
accomplish  mission-critical  activities 

4)  Missions  are  composed  as  a  series  of  interrelated 
tasks  organized  to  accomplish  some  tactical  goal 

These four components form the Mission-Task-Service-Asset
model. The integrity of these four control regimes
mimics the hierarchical interdependence of a natural ecosystem,
where  the  provisioning  of  sensitive  ecological  services  is 
predicated  on  a  complex  web  of  heterogeneous  species  and 
environmental  interactions.  Like  its  biological  counterpart,  the 
mission  ecosystem  is  vulnerable  to  cascading  disruption  at 
each  level  of  control. 

From  the  purview  of  assurance,  mission  success  is  predicated 
on  the  timely  completion  of  mission-critical  tasks.  These  tasks 
rely  on  service  availability,  which  in  turn  relies  on  asset 
availability.  Assets  and  services  are  maintained  and  delivered 
by  network  nodes,  whose  accessibility  may  vary  based  on 
endogenous  (internal  state)  or  exogenous  (environmental) 
factors.  Because  a  service  on  one  node  may  require  assets  held 
by  another  node,  service  availability  is  as  much  a  function  of 
network  topology  as  it  is  individual  system  integrity. 
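
The dependency chain from assets up to missions can be sketched as a simple availability propagation. This is only an illustration: the dependency table, the item names, and the rule that an item is available only when all of its dependencies are available are assumptions.

```python
from typing import Dict, List, Set


def available_items(requires: Dict[str, List[str]],
                    up_assets: Set[str]) -> Set[str]:
    """Return every item whose dependencies are all available.

    `requires` maps a service, task, or mission to what it depends on;
    `up_assets` is the set of basic assets currently reachable.
    """
    available = set(up_assets)
    changed = True
    while changed:  # repeat until no new item becomes available
        changed = False
        for item, deps in requires.items():
            if item not in available and all(d in available for d in deps):
                available.add(item)
                changed = True
    return available


requires = {
    "map_service": ["data_store", "router"],  # service -> assets
    "chat_service": ["router"],
    "recon_task": ["map_service"],            # task -> service
    "report_task": ["chat_service"],
    "patrol_mission": ["recon_task", "report_task"],  # mission -> tasks
}
all_up = available_items(requires, {"data_store", "router"})
degraded = available_items(requires, {"router"})
```

Losing the single data store asset removes the map service, the task that needs it, and therefore the mission, which mirrors the cascading disruption described above.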

B.  Criticality 

Our conceptual model of node risk relies on an
understanding of node influence $\rho_k$ and $\tau_k$, which are derived
from a node’s asset and service relationships with the broader
network as well as its position in the graph topology. More
formally, influence in the tactical network ecosystem is a


composite  function  of  a  node’s  local  neighborhood  (in-degree, 
out-degree,  and  betweenness),  the  nodes  across  the  network 
that  rely  on  services  it  provides,  and  its  status  as  a  hub  for 
communication  between  non-adjacent  nodes.  For  example,  a 
vehicle-mounted  network  server  which  provides  services  to 
nearby  warfighters  is  tasked  with  maintaining  a  minimum 
number of active connections in order to satisfy some time-sensitive
task (such as providing access to mission-specific
intelligence or web servers). Failure of the network server not
only  results  in  the  loss  of  function  of  that  device  but  also  the 
loss  of  function  of  any  CPS  relying  on  it  to  connect  to  other 
devices.  Additionally,  any  mission  with  a  requirement  that 
access  to  that  node  be  maintained  may  be  delayed  or 
compromised  due  to  the  time  and  cost  of  recovery.  Thus,  the 
network  server  node  exerts  influence  not  just  in  the  network 
topology,  but  also  in  the  physical,  social,  and  mission- 
assurance  domains.  We  capture  this  influence  as  follows: 

•  Let $q$ be the number of connected neighbors of node $k$, where an
edge between two nodes is determined to exist if the
source node is a member of the subset of local
neighbors using $k$ as a hub.

•  Let $r$ be the number of unique nodes that rely on
services from node $k$.

•  If $q < r$, then $\gamma_k = |q - r|$

•  Else if $q > r$, then $\gamma_k = -1 * |q - r|$
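
The two cases above can be checked with a short sketch. The function name is hypothetical, and returning 0 when q equals r is an assumption, since the text does not cover that case.

```python
def neighborhood_score(q: int, r: int) -> int:
    """Score from q (neighbors using k as a hub) and r (nodes relying on k)."""
    if q < r:
        return abs(q - r)
    if q > r:
        return -1 * abs(q - r)
    return 0  # assumption: the q == r case is not defined in the text


# Both branches reduce to the signed difference r - q.
same_as_difference = all(
    neighborhood_score(q, r) == r - q
    for q in range(6) for r in range(6)
)
```

Seen this way, the relation is positive when more nodes rely on k for services than use it as a hub, and negative otherwise.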

While  this  relation  is  useful,  it  does  not  distinguish  between 
nodes  whose  criticality  is  dictated  by  centrality.  Centrality 
concepts  were  first  developed  in  social  network  analysis  and 
are  used  to  identify  the  most  influential  hubs  in  social 
networks  and  key  infrastructure  nodes  in  the  Internet  or  urban 
networks.  In  viral  outbreak  models  this  measure  is  used  to 
identify  the  main  spreaders  of  infectious  disease  in  a 
population. There are several common measures of
centrality in networks which are
particularly suited to our problem of determining important
nodes in a tactical edge environment. For the purpose of
simplicity,  we  focus  on  betweenness  as  the  centrality 
component  of  our  criticality  metric. 

Betweenness  centrality  quantifies  the  number  of  times  a 
node  acts  as  a  bridge  along  the  shortest  path  between  two 
other  nodes.  It  was  introduced  as  a  measure  for  quantifying  the 
control  of  a  human  on  the  communication  between  other 
humans  in  a  social  network  by  Freeman  (1977);  however,  in 
our  model,  we  view  betweenness  as  a  marker  for  whether  the 
node  in  question  behaves  as  a  risk  controller  in  the  battlefield 
network. The formula for determining the betweenness of node
$v$ in graph $G = (V, E)$ with $V$ vertices follows:

$C_B(v) = \sum_{s \neq v \neq t \in V} \frac{\sigma_{st}(v)}{\sigma_{st}}$ (5)

where $\sigma_{st}$ is the total number of shortest paths from node $s$ to node
$t$ and $\sigma_{st}(v)$ is the number of those paths that pass through $v$.
Of  course,  this  formulation  rests  on  the  assumption  that 
routing  between  nodes  follows  a  shortest-path  strategy.  Other 
routing  strategies  or  external  constraints  might  make  Brandes’ 
variation  of  the  algorithm  more  suitable,  because  it  corrects 
for  edges  being  counted  multiple  times  (Brandes,  2001). 
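
For reference, a compact sketch of Brandes' accumulation scheme for unnormalized betweenness on an unweighted directed graph is given below. It assumes the graph is supplied as an adjacency dictionary without duplicate edges; the function name and example graphs are illustrative and are not taken from the paper's simulation.

```python
from collections import deque
from typing import Dict, Hashable, List


def betweenness(adj: Dict[Hashable, List[Hashable]]) -> Dict[Hashable, float]:
    """Unnormalized shortest-path betweenness for every node."""
    nodes = set(adj) | {w for nbrs in adj.values() for w in nbrs}
    score = {v: 0.0 for v in nodes}
    for s in nodes:
        # Breadth-first search from s, counting shortest paths to each node.
        sigma = {v: 0 for v in nodes}   # number of shortest s-v paths
        dist = {v: -1 for v in nodes}   # -1 marks "not yet reached"
        preds = {v: [] for v in nodes}  # predecessors on shortest paths
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj.get(v, []):
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse order of distance from s.
        delta = {v: 0.0 for v in nodes}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


chain = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}
scores = betweenness(chain)
```

On the four-node chain, the two interior nodes each lie on two shortest paths between other nodes, while the endpoints lie on none.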
V.  Temporal  Risk  Evaluation
A.  Risk  Model 

Risk magnitude, $R_i$, is the measure of risk entering an
ecosystem compartment $i$ at a given time interval $t$. This risk
can be self-generated (endogenous), via conditions such
as equipment failure, software error, and accidental misuse by
a user, or it may arise from external sources such as other
compartments (nodes) and the environment itself (exogenous).
Risk magnitude is further separated into three parameters: risk
intensity ($RI_x$), probability of risk occurrence ($P_x$), and
compartment sensitivity ($S_i$). Together, these parameters
determine the input risk value ($R_i$) at a given node as follows:

$R_i = RI_x * P_x * S_i, \quad 0 < R_i < 1$ (6)

where $RI_x$ refers to the risk intensity resulting from a state
change caused by the exploitation of vulnerability $x$, $P_x$ refers
to the probability of that exploit occurring, and $S_i$ is a constant
representing the degree of sensitivity of compartment $i$.
Taken  together,  risk  magnitude  becomes  useful  shorthand  for 
identifying  dominant  compartments  in  tactical  networks. 
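
A minimal sketch of equation (6) follows. It assumes that all three factors are normalized to the unit interval so that their product stays in range; that normalization, the function name, and the sample values are assumptions rather than details given in the text.

```python
def risk_magnitude(intensity: float, probability: float,
                   sensitivity: float) -> float:
    """Equation (6): risk intensity * occurrence probability * sensitivity."""
    for name, value in (("intensity", intensity),
                        ("probability", probability),
                        ("sensitivity", sensitivity)):
        # Assumption: each factor is normalized to [0, 1].
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return intensity * probability * sensitivity


r_example = risk_magnitude(intensity=0.8, probability=0.5, sensitivity=0.5)
```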


Fig.  1.  VTG  model  ($t = 10$)  of  a  three-node  system  with  a  single  shared  asset
vulnerability,  $V_1$  at  $A_1$.


To  better  understand  the  temporal  nature  of  vulnerability 
exploitation  in  networked  systems  we  developed  a 
vulnerability  timing  model  for  tactical  network  nodes  based  on 
hybrid  failure  propagation  modeling.  The  resultant 
vulnerability  timing  graphs  (VTG)  illustrate  the  temporal  and 
probabilistic  nature  of  system  vulnerabilities  that  propagate 
between nodes. The VTG illustrates the timing window of
events, $\tau = [t_{min}, t_{max}]$, as well as the probability of
propagation, $P(N_i, N_j)$.

B.  Risk  Mitigation 

Minimal  risk  optimization  strategies  are  procedures  that  move 
the  entire  system  towards  the  most  secure  state  possible  with 
least  risk  overall.  In  traditional  cyber-security  analysis,  the 
system  is  modeled  using  two  states,  i.e.,  secure  and  insecure 
(compromised).  Actions  which  move  the  system  from  a  state 
of  high-risk  to  a  state  of  lower-risk  while  preserving  function 
are  known  as  minimal-risk  maneuvers.  Selecting  actions 
which  consistently  perform  this  transition  is  hard  in  the 
presence  of  uncertain  information  and  random  processes. 


Suspending  or  terminating  a  service  component  is  oftentimes 
desirable  if  it  protects  the  larger  system,  but  it  is  harmful  in 
response  to  a  false  alarm.  Deliberate  triggering  by  a  malicious 
adversary  might  also  cause  self-inflicted  denial-of-service. 

In  a  control-theoretic  model,  the  system  consists  of  two 
features:  (1)  a  discrete-time  dynamic  system  and  (2)  a  cost 
function  that  is  additive  over  time.  The  cost  function  is 
additive  in  the  sense  that  the  cost  incurred  accumulates  over 
time.  However,  because  of  the  presence  of  uncertainty  in  the 
state,  the  cost  is  generally  presented  as  a  random  variable 
which  cannot  be  meaningfully  optimized  (Rowe  et  al.,  2013). 
While  optimizing  cost  may  be  difficult,  it  is  possible  to 
calculate  maneuver  costs  by  incorporating  changes  in  risk 
between  system  states. 

As indicated earlier, we assess temporal risk in tactical
networks by computing $R$ for each node in the network at
each timestep. This can be combined with the impact measure
$I$ and the criticality score $\gamma$ to identify vulnerable nodes and
evaluate the functional cost of a mitigation maneuver after a
node is compromised. The network evaluation operation can
be summarized as follows:

1.  Scan graph $G$ for infected nodes

2.  If a node is infected, add it to the set of nodes
pending treatment, INFECTED.

3.  If a node is not infected, add it to the set of
susceptible nodes, SUSCEPTIBLE

4.  Calculate $R$ and $\gamma$ for each node in the graph
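
The four evaluation steps can be sketched as a single pass over the nodes. The function signature is hypothetical: infection status is assumed to be available as a boolean per node, and the risk and criticality scores are supplied by caller-provided functions.

```python
from typing import Callable, Dict, Set, Tuple


def evaluate_network(
    infected_flags: Dict[str, bool],
    risk_fn: Callable[[str], float],
    criticality_fn: Callable[[str], float],
) -> Tuple[Set[str], Set[str], Dict[str, Tuple[float, float]]]:
    """Partition nodes and score each one with (risk, criticality)."""
    infected: Set[str] = set()
    susceptible: Set[str] = set()
    for node, is_infected in infected_flags.items():  # steps 1 to 3
        if is_infected:
            infected.add(node)
        else:
            susceptible.add(node)
    # Step 4: compute risk and criticality for every node.
    scores = {n: (risk_fn(n), criticality_fn(n)) for n in infected_flags}
    return infected, susceptible, scores


flags = {"X": False, "Y": True, "Z": False}
risk = {"X": 0.3, "Y": 0.1, "Z": 0.6}
crit = {"X": 9.0, "Y": 10.0, "Z": 8.0}
INFECTED, SUSCEPTIBLE, node_scores = evaluate_network(flags, risk.get, crit.get)
```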

Using  these  observations  we  formulate  an  agility  maneuver 
with  the  goal  of  mitigating  future  risk  to  the  network  via  a 
replacement operation. We set an arbitrary threshold $\eta$ such
that any node with $\gamma > \eta$ is considered ‘critical’. From
INFECTED, select the critical node with the greatest $I$ and
add  it  to  the  set  QUARANTINED.  Then  select  a  new  node 
from  SUSCEPTIBLE  as  a  successor  iff  it  is  eligible.  Nodes 
are  considered  eligible  for  succession  if  they  are  able  to 
provide  similar  capabilities  to  those  which  were  provided  by 
the  quarantined  node.  Depending  on  the  constraints,  this 
process  may  be  repeated  until  INFECTED  is  empty,  time 
elapses,  and/or  some  finite  resource  measure  is  exhausted. 
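
One round of the replacement maneuver might look like the sketch below. Two points are assumptions: eligibility is modeled as the successor's capability set covering that of the quarantined node, and the successor is the eligible node with the lowest risk, following the minimal-risk selection described in the introduction. All names and values are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple


def replacement_maneuver(
    infected: Set[str],
    susceptible: Set[str],
    impact: Dict[str, float],
    criticality: Dict[str, float],
    risk: Dict[str, float],
    capabilities: Dict[str, Set[str]],
    threshold: float,
) -> Tuple[Optional[str], Optional[str]]:
    """Return (node to quarantine, successor); either may be None."""
    critical_infected = [k for k in infected if criticality[k] > threshold]
    if not critical_infected:
        return None, None
    # Quarantine the critical infected node with the greatest impact.
    target = max(critical_infected, key=lambda k: impact[k])
    # Assumption: a successor must offer at least the target's capabilities.
    eligible = [k for k in susceptible
                if capabilities[target] <= capabilities[k]]
    successor = min(eligible, key=lambda k: risk[k]) if eligible else None
    return target, successor


target, successor = replacement_maneuver(
    infected={"Y"},
    susceptible={"X", "Z", "W"},
    impact={"X": 0.3, "Y": 0.4, "Z": 0.2, "W": 0.1},
    criticality={"X": 9.0, "Y": 10.0, "Z": 8.0, "W": 2.0},
    risk={"X": 0.3, "Y": 0.1, "Z": 0.6, "W": 0.05},
    capabilities={"X": {"c2"}, "Y": {"c2"}, "Z": {"c2"}, "W": {"sensor"}},
    threshold=5.0,
)
```

Here node W has the lowest risk but cannot provide the quarantined node's capability, so the lowest-risk eligible node is chosen instead. Repeating the call until no critical infected node remains, or until a budget runs out, gives the loop described above.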




Fig.  2.  Selecting  a  new  critical  node  after  infection,  (a)  pre-exploit:  nodes  X, 
Y,  and  Z  are  critical  nodes  with  Y  being  selected  and  active,  (b)  An  exploit 
occurs  on  node  Y,  leading  to  a  quarantine  operation,  (c)  post-exploit:  node  Z 
retains  the  least  risk  and  is  selected  to  replace  Y. 
Consider the scenario represented above (Fig. 2) on graph $G$
where $X$, $Y$, and $Z$ are nodes which provide mission-critical
services to the network. $Y$ is initially selected as the critical
node due to its higher C-value. Nodes $X$, $Y$, and $Z$ provide
similar services across the network. In this example, let $R_Y <
R_X < R_Z$ where $R_n$ represents the composite risk at node $n$.
After the loss of $Y$, $X$ is selected as the new
source node.


TABLE  I

TANVIS  Experimental  Results  (By  Group)

Experiment Group          Risk Force    Intensity    Cohesion
Stationary, No Exploit    0.0           0.0          1.0
Stationary, Exploit       0.71311       13.4146      0.35102
Mobile, No Exploit        0.0           0.0          0.44439
Mobile, Exploit           0.014         8.2112       02.1718

This  particular  maneuver  mimics  the  function  of  real-life 
ecosystems,  which  often  adjust  to  the  loss  of  ecological 
compartments,  services,  or  species  by  exposing  or  promoting 
alternative  ecological  niches  which  can  serve  the  remaining 
population  in  a  similar  fashion.  From  the  purview  of  cost 
analysis, this maneuver is considered a ‘sealing off’ or
quarantine  function  where  the  security  of  the  node  is 
unchanged  while  its  capability  is  reduced. 

C.  Recovery 

Restoring  function  to  lost  nodes  may  also  follow  the  same 
process,  where  critical  nodes  under  quarantine  may  be 
evaluated  for  healing  operations  (e.g.  self-healing,  patching, 
etc.)  before  being  allowed  to  rejoin  the  network.  Consider  the 
cost  function  discussed  in  (4),  risk  magnitude  in  (6),  and 
impact score in (2). We can inject these measures into risk analysis
as  follows: 


Where $x$ is an agility maneuver performed on node $n$, $R$
represents the magnitude of risk at node $n$, $I$ represents the
impact of losing node $n$, and $C$ represents the cost of
performing agility maneuver $x$. This formula allows us to
interpret the relationship between various operations with
respect to their cost. For example, the trade-off between
quarantine, self-healing, and patching can be represented with
the following relationship:

$p_n(\text{quarantine}) < p_n(\text{self-healing}) < p_n(\text{patching})$

This formula can be further refined to include terms covering
the various cost-modifiers and constraints of network
operations such as power, bandwidth, operator training,
infrastructure repair, and protocol & policy development.

VI.  Simulation  &  Experimentation

In this section, we give some preliminary results for influence
and cohesion scores for both mobile and non-mobile
experimental models. These results are intended to illustrate
the capabilities of our MANET model to accurately replicate
the behavior of propagating attacks on tactical nodes, and are not
intended to be a comprehensive study of vulnerability
exploitation on such systems. The values in Table I show the
effect of mobility on risk-force, risk-intensity, and network
cohesion.

A.  Scenario

Consider a set of infected nodes $I \subseteq G$ that are
the subject of a propagating exploit at time $t = 0$.
Observation of infected nodes by network operators may not
be immediate, and is controlled by an observation probability
$OP_x$ which scales with respect to length of infection. Network
operators are cost constrained and may only spend an arbitrary
amount of their budget to respond to observed exploits. Let
$B \subseteq G$ be the set of critical nodes. At each timestep $t$,
calculate the risk graph $RF$ and label controller and dependent
nodes. For all observed nodes, select $k$ where $k \in I$ and $k \in B$,
$k$ maximizes $\rho$, and minimizes $C$. If $C(k) <$ budget

Fig.  3.  Composite  network  risk  (R)  across  mobile  and  non-mobile
network  models  during  a  propagating  worm  attack.  Horizontal  axis:
time  (ticks);  series:  Mobile,  Non-Mobile.


Fig.  4.  Worm  exploit  simulation  with  SIR  prediction. 

Total risk force (Fig. 3) follows a predictable pattern with
respect to the change in infected population predicted by the
Standard Epidemic Model, with peak vulnerability occurring
in network topologies with high levels of cohesion. This is
understandable, as sparse networks create compartments
which are isolated from exploit due to distance or complete
inaccessibility. This phenomenon is common to topologies
displaying small-world characteristics (most nodes are not
neighbors of one another, but most nodes can be reached from
every other by a small number of hops or steps) and low
cohesion. The reduced risk force and intensity indicate a sort
of topological resistance which results from node mobility.
Likewise, average risk intensity was lower in mobile networks
as nodes dispersed prior to contact with infected neighbors.
The average subnetwork size for Random Waypoint was
4.6211 (compared to 12 in the fixed network), which is in
keeping with network cohesion. Cohesion was
lowest in the mobile network suffering from worm attack, which
can be explained by the combined loss of node connectivity
from mobility and exploit.
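
For readers unfamiliar with the SIR form of the standard epidemic model, a discrete-time sketch is given below. The contact rate, recovery rate, population size, and step count are hypothetical values chosen only to show the rise and fall of the infected population; they are not the parameters used in the simulation.

```python
from typing import List, Tuple


def sir_curve(population: int, initial_infected: int, contact_rate: float,
              recovery_rate: float, steps: int) -> List[Tuple[float, float, float]]:
    """Discrete-time SIR: returns (susceptible, infected, removed) per step."""
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(steps):
        # New infections scale with contacts between susceptible and infected.
        new_infections = min(contact_rate * s * i / population, s)
        new_removals = recovery_rate * i  # e.g. quarantined or patched nodes
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


curve = sir_curve(population=50, initial_infected=1,
                  contact_rate=0.5, recovery_rate=0.1, steps=60)
peak_infected = max(point[1] for point in curve)
```

In this sketch a lower contact rate stands in for a sparser or more mobile topology, which flattens the infected peak in the same qualitative way as the reduced risk force reported above.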


Fig.  5.  Average  service  availability  in  mobile  and  fixed  network  models. 


VII.  Conclusion 

In  this  work  we  present  a  design  and  implementation  of  a 
model  for  risk  analysis  of  agility  maneuvers  in  a  simulated 
tactical  network  environment.  This  model  was  evaluated  via 
agent-based  simulation  of  a  theoretical  tactical  environment 
including  mobile  warfighters,  their  attendant  digital  devices, 
and  an  automated  network  analysis  module.  Findings  from 
preliminary  experimentation  indicate  that  risk-based  agility 
maneuvers  such  as  quarantine  and  patching  operations 
increase  mission  assurance  by  maintaining  network  function 
even  in  scenarios  where  battery  power  and  bandwidth  limit  the 
ability  of  network  operators  to  reach  every  vulnerable  node. 
Possible future improvements include the development of
intelligent mobility models, alternative asset distribution
across nodes, and advanced behavior models for individual
warfighter agents. We intend to extend these simulation
models with data drawn from real tactical networks for the purpose
of cross-validation.